The standard genetic code is nearly universal, and relates the sequence of a translated messenger RNA (mRNA) to the sequence of amino acids in the resultant protein. A group of three consecutive mRNA nucleotides, called a codon, encodes each amino acid. Because mRNA contains four different bases, adenine (A), guanine (G), uracil (U), and cytosine (C), there are 64 (4³) possible base triplets. In fact, 61 base triplets are used to encode 20 amino acids, and the remaining three triplets, called stop codons, signal the termination of translation. Because 61 codons are used to specify only 20 amino acids, most amino acids are encoded by more than one codon (i.e., the mRNA codons are highly degenerate). The term “synonymous codons” (or synonyms) refers to mRNA codons that specify the same amino acid.
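The counts in this paragraph can be checked with a short sketch. It assumes the standard codon table written in the common compact form, with bases ordered U, C, A, G, one-letter amino acid codes, and `*` for a stop codon; the names `BASES`, `AMINO`, `CODE`, `stop_codons`, and `degeneracy` are illustrative and are not part of the original study.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(triplet): aa for triplet, aa in zip(product(BASES, repeat=3), AMINO)}

stop_codons = sorted(codon for codon, aa in CODE.items() if aa == "*")
# Number of synonymous codons per amino acid (stop codons excluded).
degeneracy = Counter(aa for aa in CODE.values() if aa != "*")

print(len(BASES) ** 3, len(CODE))                # 64 64
print(stop_codons)                               # ['UAA', 'UAG', 'UGA']
print(len(degeneracy), sum(degeneracy.values())) # 20 61
print(sorted(aa for aa, n in degeneracy.items() if n == 1))  # ['M', 'W']
```

Only Met and Trp are specified by a single codon; every other amino acid has at least two synonymous codons.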
Previous studies have demonstrated that the first two bases (the “root”) of an mRNA codon are more important than the third one (the “ending”) relative to the total stability of the Watson-Crick base pairs between the codon and the anticodon of a transfer RNA (tRNA) (Rumer, 2016a, 2016b, 2016c). There is some steric freedom/wobble in the pairing of the third base of the codon, which is why most synonymous codons differ only in the last base of the triplet. Previous research has associated the first base of the mRNA codon with the precursor from which the amino acid is synthesized, and the second position of the triplet with the hydrophobicity of the encoded amino acid (Chiusano et al., 2000; Copley, Smith, & Morowitz, 2005; Di Giulio, 1996, 1997a, 1997b; Lehmann & Libchaber, 2008; Taylor & Coates, 1989; Wong, 1975). To the author’s knowledge, no study has investigated other possible relationships between mRNA codons and the properties of the encoded amino acids; investigating such relationships is therefore the objective of this study. The author explored the association between the entire root (i.e., both the first and the second positions of the codon) of the mRNA codon and the hydrophobicity of the encoded amino acid. The author used the eight trigrams from the I Ching as a tool to characterize this relationship.
The I Ching of Fuxi (Huang, 2010), an ancient Chinese classic, was written several thousand years ago. It states that the Great Primal Beginning, Taiji, generates the two primary opposite forces, Yin and Yang. These two forces generate the four images known as bigrams or Sixiang. The four images generate the eight trigrams called Bagua, and the eight trigrams combine in ordered pairs to give 64 “hexagrams”, each of which is a concatenation of two trigrams.
The Yin-Yang concept in I Ching introduced the world’s earliest binary system. Scientists have investigated the possible connection between I Ching and computer code, genetic code, linguistics, oscillatory process, and musical harmony (Bailey, 1982; Castro-Chavez, 2019; Darvas, Koblyakov, Petoukhov, & Stepanyan, 2012; Gerber, 2001; Hu, Petoukhov, & Petukhova, 2017; Igamberdiev & Shklovskiy-Kordi, 2016; Petoukhov, 2016; Schonberger, 1992; Zhang, Chen, Chen, Xu, & Hu, 2018). Previous studies on possible connections between the genetic code and the symbolic system of I Ching have focused on the 64 hexagrams because there are 64 genetic codons (Castro-Chavez, 2011, 2012). This study focuses on the eight trigrams, each of which is composed of three unbroken (solid) and/or broken lines. The author found that the eight trigrams provide a new method for characterizing the association between mRNA codons and the hydrophobicity of the encoded amino acids.
2. Previous Studies on the Association between mRNA Codons and the Hydrophobicity of the Encoded Amino Acids
Previous studies have indicated that the second base of mRNA codons determines the hydrophobicity of the encoded amino acids: the majority of codons for hydrophilic amino acids have A in the second position, and the majority of codons for hydrophobic amino acids have U in the second position (Chiusano et al., 2000; Copley, Smith, & Morowitz, 2005; Lehmann & Libchaber, 2008; Rogers, 2019; Stambuk & Konjevoda, 2019). Seven hydrophilic amino acids, glutamine (Gln), histidine (His), aspartic acid (Asp), glutamic acid (Glu), tyrosine (Tyr), asparagine (Asn), and lysine (Lys), have A in the second position of their mRNA codons, while the other four hydrophilic amino acids, arginine (Arg), cysteine (Cys), threonine (Thr), and serine (Ser), have G or C in their second codon position. Five hydrophobic amino acids, leucine (Leu), valine (Val), phenylalanine (Phe), isoleucine (Ile), and methionine (Met), have U in the second position of their mRNA codons, while the other four hydrophobic amino acids, proline (Pro), alanine (Ala), glycine (Gly), and tryptophan (Trp), have C or G in their second codon position. Overall, eight out of 20 amino acids do not conform to the general pattern (Nelson & Cox, 2017).
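The second-base rule and its eight exceptions can be reproduced with a minimal sketch. It assumes the standard codon table in compact UCAG order and takes the hydrophilic/hydrophobic grouping exactly as listed in the paragraph above; the names `HYDROPHILIC`, `second_bases`, `conforms`, and `exceptions` are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(triplet): aa for triplet, aa in zip(product(BASES, repeat=3), AMINO)}

# Grouping taken from the text: 11 hydrophilic amino acids; the other 9 are hydrophobic.
HYDROPHILIC = set("QHDEYNKRCTS")

# Collect the set of second-position bases used by each amino acid.
second_bases = {}
for codon, aa in CODE.items():
    if aa != "*":
        second_bases.setdefault(aa, set()).add(codon[1])

def conforms(aa):
    """True if every codon has A (hydrophilic) or U (hydrophobic) in position 2."""
    expected = "A" if aa in HYDROPHILIC else "U"
    return second_bases[aa] == {expected}

exceptions = sorted(aa for aa in second_bases if not conforms(aa))
print(exceptions)       # ['A', 'C', 'G', 'P', 'R', 'S', 'T', 'W']
print(len(exceptions))  # 8
```

The eight exceptions are Ala, Cys, Gly, Pro, Arg, Ser, Thr, and Trp, which matches the count given in the text.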
3. Research Method/Design
In this study, the author used the eight I Ching trigrams, each of which is composed of three unbroken (solid) and/or broken lines (see the left column of Table 1), as a tool to characterize the relationship between mRNA codons and the hydrophobicity of the encoded amino acids. In the Yin-Yang system, Yang usually represents the stronger matter, and Yin represents the weaker matter. Since the Supreme Yang trigram is composed of three unbroken lines, whereas the Supreme Yin trigram is composed of three broken lines, the author proposes that the solid symbol “–” represents a C or G in an mRNA codon that forms the stronger C≡G or G≡C pair (three hydrogen bonds) with the tRNA anticodon, while the broken symbol “- -” represents a U or A in an mRNA codon that forms the weaker U=A or A=U pair (two hydrogen bonds) with the tRNA anticodon.
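The proposed encoding can be sketched as a small mapping from bases to line types. The function names `strength_pattern` and `draw_trigram` are hypothetical, and placing the first codon base on the bottom line is an assumption based on the Table 1 caption, which notes that trigrams are written from bottom to top.

```python
STRONG_BASES = set("CG")  # C or G: three hydrogen bonds with the anticodon base

def strength_pattern(codon):
    """Return 'S' (strong, C/G) or 'W' (weak, A/U) for each base, first base first."""
    return "".join("S" if base in STRONG_BASES else "W" for base in codon.upper())

def draw_trigram(codon):
    """Return the three trigram lines from top to bottom.

    Assumption: the first codon base is the bottom line, so the pattern is reversed.
    """
    line = {"S": "-----", "W": "-- --"}  # solid line (Yang) / broken line (Yin)
    return [line[s] for s in reversed(strength_pattern(codon))]

for codon in ("CGC", "AUG", "UGA"):
    print(codon, strength_pattern(codon))
    print("\n".join(draw_trigram(codon)))
```

Here `CGC` gives three solid lines and `AUG` gives two broken lines under one solid line, since its only strong base is in the third position.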
Table 1 contains three columns: the left column includes the symbol for and name of each trigram; the middle column correlates the three lines in each symbol with the strength of the base-pairing between each of the three bases in an mRNA codon with its anticodon partner; and the final column assigns eight mRNA codons to each trigram. Sorting the 64 mRNA codons among the eight trigrams has produced interesting results.
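The sorting step described above can be sketched by grouping all 64 triplets by their strong/weak pattern. The sketch keys each group by its pattern (for example `SSW`) rather than by a trigram name, and the names `strength_pattern` and `groups` are illustrative.

```python
from collections import defaultdict
from itertools import product

def strength_pattern(codon):
    """'S' for C/G (strong pairing), 'W' for A/U (weak pairing), first base first."""
    return "".join("S" if base in "CG" else "W" for base in codon)

# Assign every possible triplet to one of the eight strong/weak patterns.
groups = defaultdict(list)
for triplet in product("UCAG", repeat=3):
    codon = "".join(triplet)
    groups[strength_pattern(codon)].append(codon)

for pattern in sorted(groups):
    print(pattern, len(groups[pattern]), " ".join(groups[pattern]))
```

Each position offers two strong and two weak bases, so every one of the eight patterns receives exactly 2 × 2 × 2 = 8 codons.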
Table 1. Characterization of the 64 genetic codons according to the eight I Ching trigrams. Because trigrams are written from bottom to top, the order of the strength of each base-pair bond in column 2 will be reversed in the final column. Amino acids with hydrophilic side chains are shown in bold. The symbols and names of the trigrams are from the I Ching (Huang, 2010). The genetic codons, and the amino acids encoded and their properties are from Nelson and Cox (2017).
First, the start codon (AUG) and the stop codons (UAA/UAG/UGA) all fall within Yin trigrams (e.g., AUG in Mature Yin, UAA in Supreme Yin, UAG in Mature Yin, and UGA in Middle Yin). Previous studies have demonstrated that relaxed secondary structures exist at the translation start and stop sites of mRNA to allow better ribosome recognition (Shabalina, Ogurtsov, & Spiridonov, 2006). The A/U rich composition of the start and stop codons effectively reduces the likelihood of forming stably-folded RNA motifs at these sites, thus facilitating the initiation and termination of translation.
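The assignments of the start and stop codons can be checked with a brief sketch. The pattern-to-name table below is inferred from the examples given in the text (for instance, the four CGN codons of Arg in Supreme Yang and Mature Yang, and UGG in Young Yin); it is an assumption rather than a copy of Table 1. Middle Yang and Young Yang are left out because the text does not show which of the two remaining patterns carries which name.

```python
# Pattern -> trigram name, inferred from examples in the text (an assumption).
# 'S' = C/G (strong), 'W' = A/U (weak), read from first to third codon base.
TRIGRAM_BY_PATTERN = {
    "SSS": "Supreme Yang",
    "SSW": "Mature Yang",
    "WSS": "Young Yin",
    "WSW": "Middle Yin",
    "WWS": "Mature Yin",
    "WWW": "Supreme Yin",
}

def strength_pattern(codon):
    return "".join("S" if base in "CG" else "W" for base in codon)

def trigram_of(codon):
    """Look up the trigram name for a codon (KeyError for the two unnamed patterns)."""
    return TRIGRAM_BY_PATTERN[strength_pattern(codon)]

for codon in ("AUG", "UAA", "UAG", "UGA"):
    au_count = sum(base in "AU" for base in codon)
    print(codon, strength_pattern(codon), trigram_of(codon), f"A/U bases: {au_count}/3")
```

All four codons begin with a weak base and contain at least two A/U bases, which is the property the paragraph links to relaxed secondary structure.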
Second, the mRNA codons encoding amino acids with hydrophilic (i.e., polar and/or charged) side chains are mostly clustered in the middle four trigrams (i.e., Middle Yang, Young Yang, Young Yin, and Middle Yin), whereas the codons for amino acids with hydrophobic side chains are mainly located in the four trigrams at the two ends of the series (i.e., Supreme Yang, Mature Yang, Mature Yin, and Supreme Yin).
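The clustering can be quantified with a rough codon count. The sketch assumes the standard codon table and the hydrophilic/hydrophobic grouping used in this article, and it treats the "middle four" trigrams as those whose first two bases are mixed (strong then weak, or weak then strong) and the "end" trigrams as those whose first two bases are both strong or both weak. The names `root_region` and `counts` are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(triplet): aa for triplet, aa in zip(product(BASES, repeat=3), AMINO)}
HYDROPHILIC = set("QHDEYNKRCTS")  # grouping as used in this article

def root_region(codon):
    """'middle' if the first two bases differ in strength, otherwise 'ends'."""
    first_strong, second_strong = (base in "CG" for base in codon[:2])
    return "middle" if first_strong != second_strong else "ends"

counts = Counter()
for codon, aa in CODE.items():
    if aa == "*":
        kind = "stop"
    elif aa in HYDROPHILIC:
        kind = "hydrophilic"
    else:
        kind = "hydrophobic"
    counts[(root_region(codon), kind)] += 1

for key in sorted(counts):
    print(key, counts[key])
```

Under these assumptions, 22 of the 32 hydrophilic codons fall in the middle four trigrams, and 20 of the 29 hydrophobic codons fall in the four end trigrams.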
The author finds that the eight trigrams of the I Ching provide a new method for characterizing the association between the hydrophobicity of amino acids and the first two bases of their mRNA codons. The mRNA codons encoding eight hydrophilic amino acids (Gln, His, Asp, Glu, Cys, Thr, Ser, and Arg) fall within the middle four trigrams; their first two bases are strong followed by weak (i.e., C/G followed by A), or weak followed by strong (i.e., U/A followed by C/G). These arrangements generate intermediate-strength hydrogen bonds between the mRNA codon and tRNA anticodon, and produce secondary mRNA structures that are easily unwound, both of which facilitate protein synthesis.
In addition to the two synonymous codons in Young Yin and Middle Yin, the hydrophilic amino acid Arg has four additional synonymous codons within Supreme Yang and Mature Yang, where the first two bases of the codon are strong followed by strong (i.e., C followed by G). In contrast, the mRNA codons for three other hydrophilic amino acids (Tyr, Asn, and Lys) fall within Supreme Yin and Mature Yin, where the first two bases of their codons are weak followed by weak (i.e., U/A followed by A). Overall, most hydrophilic amino acids are associated with mRNA codons whose first two bases form intermediate or weak hydrogen bonds with tRNA anticodons.
The mRNA codons encoding three hydrophobic amino acids (Pro, Ala, and Gly) are characterized as Supreme Yang and Mature Yang; their first two bases are strong followed by strong (i.e., C/G followed by C/G). In contrast, the codons for hydrophobic Phe, Ile, and Met are in Supreme Yin and Mature Yin; their first two bases are weak followed by weak (i.e., U/A followed by U). Hydrophobic amino acid Leu has six synonymous codons—two fall within Supreme Yin and Mature Yin, the other four are in Middle Yang and Young Yang. The four degenerate codons of hydrophobic Val fall within Middle Yang and Young Yang, and the single codon of hydrophobic Trp is in Young Yin (Table 1). Overall, most hydrophobic amino acids are associated with the mRNA codons located within Yang trigrams where the first two bases of the codon are strong followed by strong, or within Yin trigrams where the first two bases of the codon are weak followed by weak; there are fewer hydrophobic amino acids whose mRNA codons are in the middle four trigrams.
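The per-amino-acid assignments discussed above can be tabulated with a short sketch. It assumes the standard codon table and classifies each codon only by the strength of its root (first two bases): `SS`, `SW`, `WS`, or `WW`. The names `root_class` and `roots` are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(triplet): aa for triplet, aa in zip(product(BASES, repeat=3), AMINO)}

def root_class(codon):
    """Strength of the first two bases: 'S' for C/G, 'W' for A/U."""
    return "".join("S" if base in "CG" else "W" for base in codon[:2])

# Set of root classes used by the codons of each amino acid.
roots = defaultdict(set)
for codon, aa in CODE.items():
    if aa != "*":
        roots[aa].add(root_class(codon))

for aa in sorted(roots):
    print(aa, sorted(roots[aa]))
```

Only Arg and Leu span two root classes: Arg uses `SS` (CGN) and `WS` (AGA/AGG), and Leu uses `SW` (CUN) and `WW` (UUA/UUG). Every other amino acid has all of its codons in a single root class.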
The author noticed that the three hydrophobic amino acids whose codons are at the Yang end of the eight trigrams contain either very short or unique side chains. For example, the side chain of Gly is –H, and that of Ala is –CH3, which are the two shortest side chains among the 20 amino acids. The aliphatic side chain (–CH2CH2CH2–) of Pro is unique because it is bonded to both the nitrogen and the α-carbon atoms, yielding a pyrrolidine ring. Pro will thus significantly influence the architecture of the resultant protein because the cyclic structure of Pro is more conformationally restricted than other amino acids. Why these three hydrophobic amino acids are encoded by mRNA codons that have the strongest interactions with their respective tRNA anticodons is unknown. The presence of C/G followed by C/G as the first two bases of their codons could allow stable mRNA-tRNA recognition; however, these GC-rich mRNA regions may also form complicated secondary structures and hinder the translation process. Further studies are needed to elucidate this seeming contradiction.
In summary, the author assigned the 64 mRNA codons to the eight trigrams of the I Ching, and thus uncovered features regarding the correlation between mRNA codons and the properties of their encoded amino acids. This method may provide a new avenue to explore the origin of the genetic code and the origin of life.
Data Availability Statement
All data generated and analyzed during this study are included in this published article.
This work is supported by the National Science Foundation under Award No. OIA-1458952. Any opinions, findings, and conclusions expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
 Castro-Chavez, F. (2012). Defragged Binary I Ching Genetic Code Chromosomes Compared to Nirenberg’s and Transformed into Rotating 2D Circles and Squares and into a 3D 100% Symmetrical Tetrahedron Coupled to a Functional One to Discern Start from Non-Start Methionines through a Stella Octangula. Journal of Proteome Science and Computational Biology, 1, 3.
 Castro-Chavez, F. (2019). The Digram I Ching Genetic Code Compresses the Genetic Code into 24 Compatible Main Codons. Biomedical Journal of Scientific & Technical Research, 20, 14834-14843.
 Chiusano, M. L., Alvarez-Valin, F., Di Giulio, M., D’Onofrio, G., Ammirato, G., Colonna, G., & Bernardi, G. (2000). Second Codon Positions of Genes and the Secondary Structures of Proteins. Relationships and Implications for the Origin of the Genetic Code. Gene, 261, 63-69.
 Copley, S. D., Smith, E., & Morowitz, H. J. (2005). A Mechanism for the Association of Amino Acids with Their Codons and the Origin of the Genetic Code. Proceedings of the National Academy of Sciences of the United States of America, 102, 4442-4447.
 Di Giulio, M. (1996). The Beta-Sheets of Proteins, the Biosynthetic Relationships between Amino Acids, and the Origin of the Genetic Code. Origins of Life and Evolution of Biospheres, 26, 589-609.
 Hu, Z., Petoukhov, S. V., & Petukhova, E. S. (2017). I-Ching, Dyadic Groups of Binary Numbers and the Geno-Logic Coding in Living Bodies. Progress in Biophysics & Molecular Biology, 131, 354-368.
 Lehmann, J., & Libchaber, A. (2008). Degeneracy of the Genetic Code and Stability of the Base Pair at the Second Position of the Anticodon. RNA, 14, 1264-1269.
 Rogers, S. O. (2019). Evolution of the Genetic Code Based on Conservative Changes of Codons, Amino Acids, and Aminoacyl tRNA Synthetases. Journal of Theoretical Biology, 466, 1-10.
 Rumer, Y. B. (2016a). Translation of ‘Systematization of Codons in the Genetic Code [I]’ by Yu. B. Rumer (1966). Philosophical Transactions of The Royal Society A, 374, Article ID: 20150446.
 Rumer, Y. B. (2016b). Translation of ‘Systematization of Codons in the Genetic Code [II]’ by Yu. B. Rumer (1968). Philosophical Transactions of The Royal Society A, 374, Article ID: 20150447.
 Rumer, Y. B. (2016c). Translation of ‘Systematization of Codons in the Genetic Code [III]’ by Yu. B. Rumer (1969). Philosophical Transactions of The Royal Society A, 374, Article ID: 20150448.
 Shabalina, S. A., Ogurtsov, A. Y., & Spiridonov, N. A. (2006). A Periodic Pattern of mRNA Secondary Structure Created by the Genetic Code. Nucleic Acids Research, 34, 2428-2437.
 Stambuk, N., & Konjevoda, P. (2019). Determining Amino Acid Scores of the Genetic Code Table: Complementarity, Structure, Function and Evolution. Biosystems, 187, Article ID: 104026.
 Wong, J. T. (1975). A Co-evolution Theory of the Genetic Code. Proceedings of the National Academy of Sciences of the United States of America, 72, 1909-1912.
 Zhang, T., Chen, C. L. P., Chen, L., Xu, X., & Hu, B. (2018). Design of Highly Nonlinear Substitution Boxes Based on I-Ching Operators. IEEE Transactions on Cybernetics, 48, 3349-3358.