JILSA Vol. 5 No. 3, August 2013
Evaluation and Comparison of Different Machine Learning Methods to Predict Outcome of Tuberculosis Treatment Course

Completing the tuberculosis (TB) treatment course is crucial to protect patients against prolonged infectiousness, relapse, and the lengthened and more expensive therapy required by multidrug-resistant TB. Up to 50% of all patients do not complete the treatment course. To address this problem, the World Health Organization included TB treatment with patient supervision and support as an element of the “global plan to stop TB”. The plan may require a model to predict the outcome of directly observed treatment, short-course (DOTS); such a tool could then be used to determine how intensive the services and support provided to each patient should be. This work applied and compared machine learning techniques to predict the outcome of TB therapy. After feature analysis, models based on six algorithms, namely decision tree (DT), artificial neural network (ANN), logistic regression (LR), radial basis function (RBF), Bayesian networks (BN), and support vector machine (SVM), were developed and validated. The models were built on a training set (N = 4515), tested on a separate set (N = 1935), and evaluated by prediction accuracy, F-measure, and recall. Seventeen significantly correlated features were identified (P ≤ 0.004; 95% CI = 0.001 - 0.007). DT (C4.5) was the best algorithm, with 74.21% prediction accuracy, compared with 62.06%, 57.88%, 57.31%, 53.74%, and 51.36% for ANN, BN, LR, RBF, and SVM, respectively. The characteristics of the data and its distribution may explain why DT outperformed the other methods. The predicted class for each TB case might help improve the quality of care by making patient supervision and support more case-sensitive, thereby enhancing the quality of DOTS therapy.
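The three evaluation measures named in the abstract (prediction accuracy, recall, and F-measure) can be illustrated with a minimal sketch. The abstract does not state how the outcome classes were defined or how per-class scores were combined, so the macro-averaging used here, the function name `evaluate`, and the toy labels are all assumptions made for illustration and are not taken from the study.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return accuracy plus macro-averaged recall and F-measure (assumed averaging)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp = Counter()  # true positives per class
    fp = Counter()  # false positives per class
    fn = Counter()  # false negatives per class
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    recalls, f_measures = [], []
    for label in labels:
        predicted = tp[label] + fp[label]  # cases predicted as this class
        actual = tp[label] + fn[label]     # cases truly in this class
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f_measures.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)

    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "recall": sum(recalls) / len(labels),
        "f_measure": sum(f_measures) / len(labels),
    }


# Hypothetical toy labels, not data from the study.
y_true = ["completed", "completed", "failed", "completed", "failed", "failed"]
y_pred = ["completed", "failed", "failed", "completed", "completed", "failed"]
scores = evaluate(y_true, y_pred)

# Share of all cases held out for testing, from the set sizes in the abstract.
test_fraction = 1935 / (4515 + 1935)
```

With the set sizes reported above, `test_fraction` works out to 0.3, that is, a 70/30 division between training and testing cases. Any of the six classifiers could supply `y_pred`; the scoring step is independent of the algorithm that produced the predictions.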

Cite this paper
S. Kalhori and X. Zeng, "Evaluation and Comparison of Different Machine Learning Methods to Predict Outcome of Tuberculosis Treatment Course," Journal of Intelligent Learning Systems and Applications, Vol. 5 No. 3, 2013, pp. 184-193. doi: 10.4236/jilsa.2013.53020.