ABSTRACT In efforts to understand the molecular characteristics that allow influenza viruses to cross species, various amino acid host markers in influenza viruses have been uncovered. Our previous study identified a collection of novel amino acid host markers in ten proteins of 2009 pandemic H1N1. As an extension of that work, the objective of the current study was to employ Random Forests, a robust pattern recognition technique, to discover nucleotide host markers in the ten corresponding genes of 2009 pandemic H1N1, along with those in the genes of avian and swine viruses. Although the two sets of markers differed, the amino acid markers in proteins and the nucleotide markers in the corresponding genes were associated through codon translation. Moreover, nucleotide host markers can indicate which positions within a codon are important for host switches, as well as the significance of synonymous mutations for host shifts, neither of which amino acid markers can provide. Our findings highlighted that two or even three nucleotide markers could coexist within a single codon, and that the different importance values of these markers could further discriminate among the multiple markers within a codon. The nucleotide markers found in this study rendered a comprehensive genomic view of the complex and systemic nature of host adaptation. They verified and enriched the known amino acid markers and offered a larger set of finer host markers for further experimental confirmation.
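The link between a nucleotide marker and its codon can be illustrated with a short sketch. It is not the pipeline used in the study: the function names (`locate`, `classify_substitution`) and the nine-base toy sequence are hypothetical, and it assumes an in-frame coding sequence written in the DNA alphabet (T rather than U) and translated with the standard genetic code. It shows how a 1-based nucleotide position maps to a codon and a position within that codon, and how a substitution there can be labelled synonymous or nonsynonymous.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def locate(nt_pos):
    """Map a 1-based nucleotide position to (codon number, position in codon), both 1-based."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def classify_substitution(cds, nt_pos, new_base):
    """Describe the effect of replacing the base at nt_pos of an in-frame coding sequence."""
    codon_no, offset = locate(nt_pos)
    start = (codon_no - 1) * 3
    old_codon = cds[start:start + 3]
    # Rebuild the codon with the substituted base at the given offset.
    new_codon = old_codon[:offset - 1] + new_base + old_codon[offset:]
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    return {
        "codon": codon_no,
        "position_in_codon": offset,
        "old_codon": old_codon,
        "new_codon": new_codon,
        "old_aa": old_aa,
        "new_aa": new_aa,
        "synonymous": old_aa == new_aa,
    }


if __name__ == "__main__":
    toy_cds = "ATGGAAAGA"  # hypothetical sequence: Met-Glu-Arg
    for pos, base in [(4, "A"), (6, "G"), (7, "C")]:
        print(pos, base, classify_substitution(toy_cds, pos, base))
```

In the toy sequence, the change at position 4 alters the amino acid, while the changes at positions 6 and 7 do not. A marker of the latter kind would be invisible to an amino acid level analysis.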
Cite this paper
Hu, W. (2010) Nucleotide host markers in the influenza A viruses. Journal of Biomedical Science and Engineering, 3, 684-699. doi: 10.4236/jbise.2010.37093.