JILSA, Vol. 3 No. 3, August 2011
Learning Probabilistic Models of Hydrogen Bond Stability from Molecular Dynamics Simulation Trajectories
Hydrogen bonds (H-bonds) play a key role in both the formation and stabilization of protein structures. H-bonds involving atoms from residues that are close to each other in the main-chain sequence stabilize secondary structure elements. H-bonds between atoms from distant residues stabilize a protein’s tertiary structure. However, H-bonds vary greatly in stability. They form and break while a protein deforms. For instance, the transition of a protein from a non-functional to a functional state may require some H-bonds to break and others to form. The intrinsic strength of an individual H-bond has been studied from an energetic viewpoint, but energy alone may not be a very good predictor of stability, because other local interactions may reinforce (or weaken) an H-bond. This paper describes inductive learning methods to train a protein-independent probabilistic model of H-bond stability from molecular dynamics (MD) simulation trajectories. The training data describe H-bond occurrences at successive times along these trajectories by the values of attributes called predictors. A trained model is constructed in the form of a regression tree in which each non-leaf node is a Boolean test (split) on a predictor. Each occurrence of an H-bond maps to a path in this tree from the root to a leaf node, and its predicted stability is the value associated with that leaf. Experimental results demonstrate that such models can predict H-bond stability quite well. In particular, their performance is roughly 20% better than that of models based on H-bond energy alone. In addition, they can accurately identify a large fraction of the least stable H-bonds in a given conformation. The paper discusses several extensions that may yield further improvements.
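The regression-tree structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The predictor names (`dist_h_a`, `angle_d_h_a`), the thresholds, and the leaf values are hypothetical and chosen only for illustration. The leaf value is read here as a stability score in [0, 1], which is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    # Stability predicted for every H-bond occurrence that reaches this leaf.
    stability: float


@dataclass
class Split:
    # Boolean test "predictor <= threshold" at a non-leaf node.
    predictor: str
    threshold: float
    if_true: "Node"
    if_false: "Node"


Node = Union[Leaf, Split]


def predict(node: Node, occurrence: Dict[str, float]) -> Tuple[float, List[str]]:
    """Follow the root-to-leaf path selected by the predictor values.

    Returns the stability stored at the leaf and the list of tests taken.
    """
    path: List[str] = []
    while isinstance(node, Split):
        passed = occurrence[node.predictor] <= node.threshold
        path.append(f"{node.predictor} <= {node.threshold}: {passed}")
        node = node.if_true if passed else node.if_false
    return node.stability, path


# Hypothetical tree: all names and numbers below are illustrative assumptions.
tree = Split(
    predictor="dist_h_a",          # hydrogen-acceptor distance (assumed name)
    threshold=2.2,
    if_true=Split(
        predictor="angle_d_h_a",   # donor-hydrogen-acceptor angle (assumed name)
        threshold=140.0,
        if_true=Leaf(0.6),
        if_false=Leaf(0.9),
    ),
    if_false=Leaf(0.3),
)

stability, path = predict(tree, {"dist_h_a": 2.0, "angle_d_h_a": 160.0})
print(stability)
for test in path:
    print(test)
```

Each call to `predict` evaluates one Boolean test per non-leaf node on the path, so the cost of a prediction is proportional to the depth of the tree and not to the number of training occurrences.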

Cite this paper
I. Chikalov, P. Yao, M. Moshkov and J. Latombe, "Learning Probabilistic Models of Hydrogen Bond Stability from Molecular Dynamics Simulation Trajectories," Journal of Intelligent Learning Systems and Applications, Vol. 3 No. 3, 2011, pp. 155-170. doi: 10.4236/jilsa.2011.33017.