absence of medical diagnostic evidence, it is difficult for experts to
state the grade of the disease with confidence. Generally, many tests are
performed that involve clustering or classification of large-scale data. However,
a large number of tests can complicate the main diagnostic process and make
the final results harder to obtain. This difficulty could be resolved with the aid of
machine learning techniques. In this research, we present a comparative study
of different classification techniques using three data mining tools:
WEKA, TANAGRA and MATLAB. The aim of this paper is to analyze the performance
of different classification techniques on a large data set. A basic
review of the selected techniques is presented by way of introduction. The
diabetes data set, with 768 instances and 9 attributes (8 inputs and 1
output), is used to test and explain the differences between the
classification methods. Subsequently, the classification technique that has the
potential to significantly improve on the common or conventional methods is
suggested for use in large-scale data, bioinformatics or other general
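
As a rough illustration of the kind of comparison described above, the following Python sketch scores two classifiers with k-fold cross-validation. Everything in it is an assumption made for illustration: the data are synthetic (only the shape, 768 instances with 8 inputs and 1 binary output, comes from the abstract), the two classifiers (a majority-class baseline and a nearest-centroid rule) are simple stand-ins rather than the techniques evaluated in the paper, and the choice of 10 folds is arbitrary.

```python
import random
from collections import Counter

# Sizes taken from the abstract; the values generated below are synthetic.
N_INSTANCES, N_INPUTS = 768, 8


def make_synthetic_data(seed=0):
    """Return (X, y) with random inputs and a binary label. Not real diabetes data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(N_INSTANCES):
        label = rng.randint(0, 1)
        # Shift the class-1 mean so the two classes are partly separable.
        X.append([rng.gauss(label * 1.0, 1.0) for _ in range(N_INPUTS)])
        y.append(label)
    return X, y


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


class MajorityClass:
    """Baseline: always predict the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label for _ in X]


class NearestCentroid:
    """Predict the class whose mean feature vector is closest (squared Euclidean)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [min(self.centroids, key=lambda c: dist(x, self.centroids[c])) for x in X]


def cross_val_accuracy(make_model, X, y, k=10):
    """Pooled accuracy over k folds; each instance is tested exactly once."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = make_model().fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = model.predict([X[i] for i in test_idx])
        correct += sum(p == y[i] for p, i in zip(preds, test_idx))
    return correct / len(X)


X, y = make_synthetic_data()
results = {cls.__name__: cross_val_accuracy(cls, X, y)
           for cls in (MajorityClass, NearestCentroid)}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Because every model is scored on the same folds, the accuracies are directly comparable; the same harness could wrap any classifier that exposes `fit` and `predict`.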
Cite this paper
R. Rahman and F. Afroz, "Comparison of Various Classification Techniques Using Different Data Mining Tools for Diabetes Diagnosis," Journal of Software Engineering and Applications, Vol. 6 No. 3, 2013, pp. 85-97. doi: 10.4236/jsea.2013.63013