JSEA Vol. 5 No. 12, December 2012
ML-CLUBAS: A Multi Label Bug Classification Algorithm
Abstract: This paper presents ML-CLUBAS (Multi Label-Classification of software Bugs Using Bug Attribute Similarity), a multi-label variant of the CLUBAS algorithm [1]. CLUBAS is a hybrid algorithm that combines text clustering, frequent term calculation, and taxonomic term mapping, and is an example of classification by clustering. CLUBAS is a single-label algorithm: each bug cluster is mapped to exactly one bug category. However, a bug cluster can belong to more than one category when its cluster label matches the terms of more than one category, and ML-CLUBAS is designed to handle this case. The algorithm is evaluated using F-measure, accuracy, number of clusters, and purity. These results are compared with those of CLUBAS and of other multi-label text clustering algorithms.
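The difference between single-label and multi-label mapping described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The taxonomy contents, the function names, and the tie-breaking rule for the single-label case are all assumptions made for illustration. The sketch only assumes that a cluster label is a list of frequent terms and that each category is described by a set of taxonomic terms.

```python
from typing import Dict, List, Optional, Set

# Hypothetical taxonomy (category -> taxonomic terms); illustrative only,
# not taken from the paper.
TAXONOMY: Dict[str, Set[str]] = {
    "gui": {"button", "window", "layout"},
    "network": {"socket", "timeout", "connection"},
    "database": {"query", "table", "connection"},
}


def map_cluster_multi_label(label_terms: List[str],
                            taxonomy: Dict[str, Set[str]]) -> List[str]:
    """Return every category whose terms overlap the cluster label terms."""
    terms = {t.lower() for t in label_terms}
    return sorted(cat for cat, cat_terms in taxonomy.items()
                  if terms & cat_terms)


def map_cluster_single_label(label_terms: List[str],
                             taxonomy: Dict[str, Set[str]]) -> Optional[str]:
    """Return the single category with the most matching terms, or None.

    Assumption: ties are broken alphabetically by category name.
    """
    terms = {t.lower() for t in label_terms}
    best = max(sorted(taxonomy), key=lambda c: len(terms & taxonomy[c]))
    return best if terms & taxonomy[best] else None


if __name__ == "__main__":
    label = ["connection", "timeout"]
    print(map_cluster_single_label(label, TAXONOMY))
    print(map_cluster_multi_label(label, TAXONOMY))
```

In this toy example the term "connection" appears under two categories, so the multi-label mapping assigns the cluster to both, while the single-label mapping keeps only the category with the larger overlap.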
Cite this paper: N. Nagwani and S. Verma, "ML-CLUBAS: A Multi Label Bug Classification Algorithm," Journal of Software Engineering and Applications, Vol. 5 No. 12, 2012, pp. 983-990. doi: 10.4236/jsea.2012.512113.

[1]   N. K. Nagwani and S. Verma, “CLUBAS: An Algorithm and Java Based Tool for Software Bug Classification Using Bug Attributes Similarities,” Journal of Software Engineering and Applications, Vol. 5 No. 6, 2012, pp. 436-447. doi:10.4236/jsea.2012.56050

[2]   S. Chapman, “Simmetrics, Java Based API for Text Similarity Measurement,” 2011.

[3]   C. D. Manning, P. Raghavan and H. Schütze, “Introduction to Information Retrieval,” 2008.

[4]   H. Li, K. Zhang and T. Jiang, “Minimum Entropy Clustering and Applications to Gene Expression Analysis,” Proceedings of IEEE Computational Systems Bioinformatics Conference, Stanford, August 2004, pp. 142-151.

[5]   I. H. Witten, E. Frank, L. E. Trigg, M. A. Hall, G. Holmes and S. J. Cunningham, “Weka (Waikato Environment for Knowledge Analysis),” 2011.

[6]   “Android Bug Repository,” 2011.

[7]   JBoss-Seam, “Bug Repository,” 2011.

[8]   “Mozilla Bug Repository,” 2011.

[9]   MySQL, “Bug Repository,” 2011.

[10]   S. Osinski, J. Stefanowski and D. Weiss, “Lingo: Search Results Clustering Algorithm Based on Singular Value Decomposition,” Proceedings of the International Intelligent Information Processing and Web Mining Conference, Zakopane, 17-20 May 2004, pp. 359-368.

[11]   S. Osinski, “An Algorithm for Clustering of Web Search Results,” Master’s thesis, Poznań University of Technology, Poznań, 2003.

[12]   O. Zamir and O. Etzioni, “Grouper: A Dynamic Clustering Interface for Web Search Results,” Computer Networks, Vol. 31, No. 11-16, 1999, pp. 1361-1374. doi:10.1016/S1389-1286(99)00054-7

[13]   O. Zamir and O. Etzioni, “Web Document Clustering: A Feasibility Demonstration,” Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Melbourne, 24-28 August 1998, pp. 46-54.

[14]   W. Li, “Random Texts Exhibit Zipf’s-Law-Like Word Frequency Distribution,” IEEE Transactions on Information Theory, Vol. 38, No. 6, 1992, pp. 1842-1845. doi:10.1109/18.165464

[15]   W. J. Reed, “The Pareto, Zipf and Other Power Laws,” Economics Letters, Vol. 74, No. 1, 2001, pp. 15-19. doi:10.1016/S0165-1765(01)00524-9