JILSA, Vol. 3 No. 3, August 2011
Learning Probabilistic Models of Hydrogen Bond Stability from Molecular Dynamics Simulation Trajectories
Abstract: Hydrogen bonds (H-bonds) play a key role in both the formation and stabilization of protein structures. H-bonds involving atoms from residues that are close to each other in the main-chain sequence stabilize secondary structure elements. H-bonds between atoms from distant residues stabilize a protein’s tertiary structure. However, H-bonds vary greatly in stability. They form and break as a protein deforms. For instance, the transition of a protein from a non-functional to a functional state may require some H-bonds to break and others to form. The intrinsic strength of an individual H-bond has been studied from an energetic viewpoint, but energy alone may not be a very good predictor of stability, because other local interactions may reinforce (or weaken) an H-bond. This paper describes inductive learning methods to train a protein-independent probabilistic model of H-bond stability from molecular dynamics (MD) simulation trajectories. The training data describe H-bond occurrences at successive times along these trajectories by the values of attributes called predictors. A trained model is constructed in the form of a regression tree in which each non-leaf node is a Boolean test (split) on a predictor. Each occurrence of an H-bond maps to a path in this tree from the root to a leaf node, and its predicted stability is the value associated with that leaf. Experimental results demonstrate that such models can predict H-bond stability quite well. In particular, their performance is roughly 20% better than that of models based on H-bond energy alone. In addition, they can accurately identify a large fraction of the least stable H-bonds in a given conformation. The paper discusses several extensions that may yield further improvements.
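The regression-tree lookup described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' trained model: the predictor names (`donor_acceptor_distance`, `energy`), the thresholds, and the leaf values are all hypothetical, and the sketch assumes that each split has the form "predictor <= threshold" and that stability is reported as a number between 0 and 1.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """A regression-tree node: a leaf if `stability` is set, otherwise a split."""
    stability: Optional[float] = None   # predicted stability (leaf nodes only)
    predictor: Optional[str] = None     # attribute tested at this node (hypothetical name)
    threshold: Optional[float] = None   # the Boolean test is: value <= threshold
    if_true: Optional["Node"] = None    # child followed when the test holds
    if_false: Optional["Node"] = None   # child followed when the test fails


def predict(root: Node, occurrence: Dict[str, float]) -> float:
    """Follow the root-to-leaf path selected by one H-bond occurrence."""
    node = root
    while node.stability is None:
        test_holds = occurrence[node.predictor] <= node.threshold
        node = node.if_true if test_holds else node.if_false
    return node.stability


# Hand-built toy tree; every number below is made up for illustration.
tree = Node(
    predictor="donor_acceptor_distance",
    threshold=2.0,
    if_true=Node(
        predictor="energy",
        threshold=-1.5,
        if_true=Node(stability=0.9),
        if_false=Node(stability=0.6),
    ),
    if_false=Node(stability=0.2),
)

occurrence = {"donor_acceptor_distance": 1.9, "energy": -2.0}
print(predict(tree, occurrence))
```

Each call walks exactly one path from the root to a leaf, so the cost of a prediction grows with the depth of the tree rather than with the number of training occurrences.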
Cite this paper: I. Chikalov, P. Yao, M. Moshkov and J. Latombe, "Learning Probabilistic Models of Hydrogen Bond Stability from Molecular Dynamics Simulation Trajectories," Journal of Intelligent Learning Systems and Applications, Vol. 3 No. 3, 2011, pp. 155-170. doi: 10.4236/jilsa.2011.33017.
