ENG Vol. 5 No. 10B, October 2013
Prediction of Peptides Binding to Major Histocompatibility Class II Molecules Using Machine Learning Methods
Abstract: In daily life, we are frequently attacked by infectious organisms such as bacteria and viruses. Major histocompatibility complex (MHC) molecules play an essential role in T-cell activation and in initiating an adaptive immune response. Developing methods to predict MHC-peptide binding is therefore important for vaccine design and immunotherapy. In this study, we predict binding between peptides and MHC class II molecules. A support vector machine (SVM) and a multi-layer perceptron (MLP) are used for classification. Both classifiers operate on pseudo amino acid composition (PseAAC) features of the data, extracted from the PseAAC server. Because the dataset used in this work is imbalanced, we apply a pre-processing step that over-samples the minority class to overcome this problem. The results show that using pseudo amino acid composition together with over-sampling increases the performance of the predictor. Furthermore, the results demonstrate that combining PseAAC with an SVM is a successful method for predicting peptide binding to MHC class II molecules.
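The pre-processing described in the abstract can be illustrated with a minimal Python sketch. It makes several assumptions that are not taken from the paper: only the 20 plain amino acid frequencies are computed (the sequence-order correlation terms of the full PseAAC are omitted), the minority class is balanced by simple random duplication (the abstract does not say which over-sampling variant the authors used), and the peptides, labels, and function names are hypothetical.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(peptide):
    """Return the 20 residue frequencies of a peptide, in AMINO_ACIDS order.

    This is only the composition part of PseAAC; the sequence-order
    correlation factors are deliberately left out of this sketch.
    """
    counts = Counter(peptide.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("peptide contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest one (assumed strategy, not the paper's)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(features, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    out_features, out_labels = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            out_features.append(row)
            out_labels.append(label)
    return out_features, out_labels


# Hypothetical toy data: 1 = binder, 0 = non-binder (not from the paper).
peptides = ["AAKLV", "GGSTP", "KLMNP", "WYFAC"]
labels = [1, 0, 0, 0]
X = [amino_acid_composition(p) for p in peptides]
X_bal, y_bal = random_oversample(X, labels)
```

The balanced feature vectors in `X_bal` with labels `y_bal` could then be passed to a classifier such as an SVM or an MLP. Over-sampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.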
Cite this paper: Faramarzi, F., Beigi, M., Botorabi, Y. and Mousavi, N. (2013) Prediction of Peptides Binding to Major Histocompatibility Class II Molecules Using Machine Learning Methods. Engineering, 5, 513-517. doi: 10.4236/eng.2013.510B105.
