Clustering Student Discussion Messages on Online Forum by Visualization and Non-Negative Matrix Factorization

Affiliation(s)

School of Computing and Mathematics, Charles Sturt University, Australia.

Faculty of Education Information Technology, South China Normal University, China.

Faculty of Information and Communication Technologies, Swinburne University of Technology, Australia.

ABSTRACT

Online discussion forums can effectively engage students in their studies.
As the number of messages posted on a forum grows, it becomes harder for
instructors to read and respond to them promptly. In this paper, we apply
non-negative matrix factorization (NMF) and visualization to cluster message data,
providing a summary view of the messages that discloses their deep semantic
relationships. In particular, NMF finds the underlying issues hidden in the
messages that concern most students. Visualization is used to estimate the
initial number of clusters and to show the relation communities. Experiments
and comparisons on a real dataset are reported to demonstrate the effectiveness
of the approaches.
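
As a rough illustration of how NMF can group messages by topic, the sketch below factors a small term-by-message count matrix V into non-negative factors W and H using the standard multiplicative updates for the squared error, then assigns each message to the factor with the largest weight in its column of H. This is not the authors' implementation: the toy messages, the helper names (`transpose`, `matmul`, `nmf`), the raw-count weighting, and the hand-picked number of clusters `k = 2` are all assumptions made for the example. In the paper, the initial number of clusters is estimated through visualization rather than fixed by hand.

```python
import random

# Toy forum messages (hypothetical; not the dataset used in the paper).
messages = [
    "assignment deadline extension request",
    "assignment submission deadline",
    "deadline extension assignment",
    "exam revision lecture notes",
    "lecture notes exam",
    "exam revision questions",
]

def transpose(A):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Multiply two list-of-lists matrices."""
    Bt = transpose(B)
    return [[sum(x * y for x, y in zip(row, col)) for col in Bt] for row in A]

def nmf(V, k, iters=300, seed=0, eps=1e-9):
    """Approximate V (terms x messages) by W H with W, H >= 0,
    using multiplicative updates for the squared error."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so no entry is stuck at zero.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

# Term-by-message count matrix: one row per term, one column per message.
vocab = sorted({word for msg in messages for word in msg.split()})
V = [[float(msg.split().count(term)) for msg in messages] for term in vocab]

k = 2  # assumed here; the paper estimates this from the visualization
W, H = nmf(V, k)

# Each message joins the cluster whose factor has the largest weight for it.
labels = [max(range(k), key=lambda a: H[a][j]) for j in range(len(messages))]
for label, msg in zip(labels, messages):
    print(label, msg)
```

The columns of W can be read as term profiles of the underlying issues, and the columns of H as each message's weight on those issues. Cluster numbering is arbitrary and depends on the random start.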

Cite this paper

X. Huang, J. Zhao, J. Ash and W. Lai, "Clustering Student Discussion Messages on Online Forum by Visualization and Non-Negative Matrix Factorization," *Journal of Software Engineering and Applications*, Vol. 6, No. 7, 2013, pp. 7-12. doi: 10.4236/jsea.2013.67B002.
