Applied Mathematics (AM), Vol. 5 No. 1, January 2014
A Method to Predict Amino Acids at Proximity of Beta-Sheet Axes from Protein Sequences
ABSTRACT

A general and elementary protein folding step was described in a previous article. Energy conservation during this folding step yielded an equation with remarkable solutions over the field of rational numbers. Sets of sequences optimized for folding were derived. In this work, a geometrical analysis of protein beta-sheet backbone structures allows positions of topological interest to be defined. These positions correspond to the alpha carbons of amino acids located on, or close to, a unique axis, defined here, that crosses all strands of a beta-sheet. They are shown to be highly correlated with the absence of sequences optimized for folding. Applications to protein structure prediction, in particular the quality assessment of structural models, are envisioned.


Cite this paper
A. Guilloux, B. Caudron and J. Jestin, "A Method to Predict Amino Acids at Proximity of Beta-Sheet Axes from Protein Sequences," Applied Mathematics, Vol. 5 No. 1, 2014, pp. 79-89. doi: 10.4236/am.2014.51009.

 
 