ABSTRACT Influenza A viruses have three gene segments, M, NS, and PB1, each of which codes for more than one protein. Because the genes within a segment overlap, they are interdependent, and this interdependence could be reflected in the evolutionary constraints, host distinction, and co-mutations of influenza. Most previous studies of overlapping genes focused on their unique evolutionary constraints, and little has been done to assess the potential impact of the overlap on other biological aspects of influenza. In this study, we aimed to explore the mutual dependence in host differentiation and co-mutations in M, NS, and PB1 of avian, human, 2009 H1N1, and swine viruses, using Random Forests, information entropy, and mutual information. The host markers and the highly co-mutated individual sites and site pairs (P values < 0.035) in the three gene segments were identified, and their relative significance between the overlapping genes was calculated. Further, Random Forests predicted that, among the three stop codons in the current PB1-F2 gene of 2009 H1N1, the significance of a mutation for host differentiation ranked, from most to least, as sites 12, 58, and 88; that is, the closer the site was to the start of the gene, the more important the mutation. Finally, our sequence analysis surprisingly revealed that the full-length PB1-F2, if all three stop codons were mutated, would function more as a swine protein than as a human protein, although the PB1 of 2009 H1N1 was derived from human H3N2.
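As a rough illustration of the two information-theoretic quantities named in the abstract, the following minimal sketch computes the Shannon entropy of an alignment site and the mutual information between two sites. It is not the pipeline used in the paper: the function names `entropy` and `mutual_information` and the four-sequence toy alignment are assumptions made only for this example.

```python
from collections import Counter
from math import log2


def entropy(column):
    """Shannon entropy (in bits) of the symbols observed at one alignment site."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two sites: H(A) + H(B) - H(A, B)."""
    if len(col_a) != len(col_b):
        raise ValueError("both sites must come from the same set of sequences")
    joint = list(zip(col_a, col_b))  # paired residues, one pair per sequence
    return entropy(col_a) + entropy(col_b) - entropy(joint)


# Hypothetical toy alignment (not real influenza data): 4 sequences, 3 sites.
toy_alignment = ["MKT", "MRS", "MKT", "MRS"]
columns = list(zip(*toy_alignment))  # transpose rows (sequences) into sites

site_entropy = [entropy(col) for col in columns]
mi_sites_2_3 = mutual_information(columns[1], columns[2])
```

In this toy alignment the first site is fully conserved, so its entropy is zero, while the second and third sites always change together, so their mutual information equals the entropy of either site. Real alignments would additionally need gap handling and a significance test, which this sketch omits.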
Cite this paper
Hu, W. (2010) Host markers and correlated mutations in the overlapping genes of influenza viruses: M1, M2; NS1, NS2; and PB1, PB1-F2. Natural Science, 2, 1225-1246. doi: 10.4236/ns.2010.211150.