We live in the information age. There is a wealth of information around us. Shannon’s presentation of information theory in 1948 occurred at the beginning of the information age (Shannon, 1948). Shannon treated alphabetic messages as information. Meanwhile, the first computer, ENIAC, was launched shortly before Shannon published his information theory (Campbell-Kelly, Aspray, Ensmenger, & Yost, 2018a). Early computers could process only numbers and letters; computers now can process images and sounds (Campbell-Kelly, Aspray, Ensmenger, & Yost, 2018b). Computers convert numbers, letters, images, and sounds into 0s and 1s. All things that can be processed by computers can be referred to as information. Computers play a central role in today’s information society.
At the same time, the latter half of the 20th century was an era of molecular biology. In 1953, five years after Shannon published his information theory, Watson and Crick unraveled the DNA double helix, through which the substance of the gene was elucidated (Watson & Crick, 1953). Pairs of a pyrimidine base and a purine base exist inside the DNA double helix. There are two types of pairs: guanine and cytosine pairs; and adenine and thymine pairs. This structure is advantageous for accurate replication. Furthermore, it is easily observable from this structure that the nucleotide sequence is the substantial entity of the genetic information. Inevitably, genetic information is digital information. Dawkins stated, “What is truly revolutionary about molecular biology in the post-Watson-Crick era is that it has become digital” (Dawkins, 1995).
Many genes encode proteins. As there are 20 types of amino acids that make up proteins, a single base, of which there are only four types, is not enough to specify an amino acid; nor are two bases, which give only 16 combinations. Thus, three bases correspond to one genetic code (a codon). There are 64 genetic codes that encode 20 amino acids and the stop codon (Alberts, Johnson, Lewis, Raff, Roberts, & Walter, 2002c). Digital information can be converted from one representation to another. The simplest digital information consists of two signals, 0 and 1. For example, letters of the alphabet are converted into 7-bit binary numbers using the American Standard Code for Information Interchange, commonly known as ASCII. Likewise, genetic codes can be converted into 6-bit binary numbers, because each of the four bases can be written with two bits. Thus, the alphabet and the genetic code are isomorphic. The surprising truth is that the genetic code is similar to the alphabet. The genetic code is common to most living organisms and is thought to have been determined in the very early stages of the evolution of living organisms (Alberts, Johnson, Lewis, Raff, Roberts, & Walter, 2002d). Genetic information is considered the oldest information.
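The correspondence between ASCII and the genetic code can be illustrated with a minimal sketch. The two-bit value assigned to each base below is an arbitrary assumption made for illustration (any one-to-one assignment would serve), and the function names are hypothetical.

```python
from itertools import product

# Assumed 2-bit value per base; this particular assignment is arbitrary.
BASE_BITS = {"T": "00", "C": "01", "A": "10", "G": "11"}


def codon_to_bits(codon: str) -> str:
    """Return the 6-bit string for a three-base codon."""
    if len(codon) != 3:
        raise ValueError("a codon has exactly three bases")
    return "".join(BASE_BITS[base] for base in codon.upper())


def char_to_ascii_bits(ch: str) -> str:
    """Return the 7-bit ASCII string for a single character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


# Enumerate all 4 * 4 * 4 = 64 codons and their 6-bit codes.
ALL_CODONS = ["".join(p) for p in product("TCAG", repeat=3)]
CODON_TABLE = {codon: codon_to_bits(codon) for codon in ALL_CODONS}

print(char_to_ascii_bits("A"))          # 1000001
print(codon_to_bits("ATG"))             # 100011
print(len(set(CODON_TABLE.values())))   # 64
```

Because six bits give exactly 64 distinct values, every codon receives its own binary number, just as every ASCII character receives its own 7-bit number.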
According to the second law of thermodynamics, the entropy of the universe is always increasing. However, life seems to counteract the second law of thermodynamics. Life keeps order, proliferates, and evolves. In this paper, we shall consider this apparent contradiction from the viewpoint of information. Firstly, life maintains the order in its body by using information and energy. This feature is specific to life and to structures built by living organisms, such as human buildings, beehives, beaver dams, and bird nests, whose order is likewise maintained. Because life is an open system, it is consistent with the second law of thermodynamics that life uses information and energy to maintain order.
Secondly, we shall consider the reason why living organisms can preserve information despite the second law of thermodynamics. Natural selection plays a central role in the preservation of information. Natural selection uses both life and death. While an organism is alive, it proliferates and retains genetic information. When it dies, its body is rapidly degraded and its genetic information is lost. Together, life and death increase the number of genes that are advantageous for survival and eliminate genes that are disadvantageous for survival.
2. Natural Selection
Natural selection is a core concept of Darwinian evolution. The premise of natural selection is that life increases very rapidly (Darwin, 2019). In On the Origin of Species, Darwin says, “A struggle for existence inevitably follows from the high rate at which all organic beings tend to increase”. Living organisms tend to increase exponentially, and would eventually reach numbers that cannot be supported by the resources on Earth. I quote Darwin’s consideration of elephants.
The elephant is reckoned to be the slowest breeder of all known animals, and I have taken some pains to estimate its probable minimum rate of natural increase: it will be under the mark to assume that it breeds when thirty years old, and goes on breeding till ninety years old, bringing forth three pair of young in this interval; if this be so, at the end of the fifth century there would be alive fifteen million elephants, descended from the first pair.
As the quotation above shows, even elephants, if they increased without limit, would fill the earth in a short period of time. Therefore, the struggle for existence is inevitable, and many individuals die. That is, natural selection involves the deaths of many individuals.
Bacteria grow quickly and compete intensely for survival. We shall consider bacteria as a typical example of natural selection. Black’s microbiology textbook (Black, 2015) has the following statement:
Thermophilic sulfur bacteria find zones of optimum growth temperatures in the runoff troughs of geysers. Different species collect at various locations along the sides of the trough. The most heat-tolerant are near the geyser, and those with lesser heat tolerance are distributed in regions where the water has cooled to their optimum temperature.
Bacteria that prefer high temperatures are called thermophilic bacteria. A geyser is a hot spring that intermittently spouts boiling water. Thermophilic sulfur bacteria live there, and surprisingly, there are bacteria adapted to each temperature found in the runoff troughs of geysers. Bacteria adapt to the environment surprisingly quickly because they grow very fast, which results in more mutations. Inevitably, adaptation involves a great many bacterial deaths.
3. The Preservation of Information
In 1944, Schrodinger published “What Is Life?”, which marked the beginning of molecular biology. This book inferred the molecular structure of the gene and led to Watson and Crick’s discovery (Schrodinger, 1967a). In addition, Schrodinger stated, “living matter evades the decay to equilibrium” (Schrodinger, 1967b). This is an essential feature of life. Thus, it seems that living organisms escape the second law of thermodynamics. Schrodinger summed up the discussions that had occurred about this problem. However, he could not have known of information theory, which Shannon published four years later. From a modern point of view, this was a major hindrance to his work. Nevertheless, he stated, “‘drinking orderliness’ from a suitable environment seems to be connected with the presence of the ‘aperiodic solids’” (Schrodinger, 1967c). In doing so, he noticed the core of the problem. Schrodinger’s aperiodic solids were DNA.
Specifically, given that the precise blueprints of biomolecules are stored in DNA, life can be kept in order. From a modern perspective, life can maintain complex structures based on the information stored in DNA. In addition, the action of biological macromolecules, such as enzymes and membrane proteins, has prevented life from falling into equilibrium. In other words, life maintains order using the information stored in DNA.
Although life stores information, storing information is difficult. No matter how we store information, we must maintain a state that is out of equilibrium. However, the second law of thermodynamics still applies. Any isolated system goes to equilibrium, and information is lost in equilibrium. In other words, the permanent preservation of information is impossible.
However, the preservation of information by life is overwhelmingly reliable. According to the neutral theory of evolution (Kimura, 1968), natural selection generally preserves genetic information. Thus, the amino acid sequences of proteins important for survival are highly conserved throughout evolution. For example, histone H4 is a protein that is a constituent of eukaryotic chromosomes. Given the importance of histone H4 for survival, almost all amino acid changes in histone H4 are fatal. As a result, variants of the amino acid sequence of histone H4 are nearly always eliminated by natural selection. A comparison of the amino acid sequences of calf and pea histone H4 shows that only two of the 102 amino acid residues differ (DeLange et al., 1969), even though animals and plants are assumed to have diverged from a common ancestor about 1.2 billion years ago (Alberts, Johnson, Lewis, Raff, Roberts, & Walter, 2002a).
Furthermore, because the small subunit of ribosomal RNA is very well conserved, it has been used to classify three major domains of living organisms. In addition, 239 common gene families have been found in three major domains of living organisms. The genes common to the three domains of these organisms are estimated to have been conserved for over 3 billion years (Alberts, Johnson, Lewis, Raff, Roberts, & Walter, 2002b).
In his book, Michael Lynch (Lynch, 2007) stated, “the total amount of DNA in living organisms is on the order of $10^{25}$ km, which is equivalent to a distance of $10^{12}$ light-years, or 10 times the diameter of the known universe”. This number is so astronomical that it is difficult to conceptualize the extensive amount of information that has been accurately copied. Moreover, DNA evolves over time rather than deteriorating.
Although DNA is a stable molecule, it cannot store information for long periods of time by itself. Natural selection is necessary for the preservation of information. If it were not for natural selection, genetic information would be quickly lost.
Let us consider a simplified example. The mean error rate of DNA replication is about one base per $10^9$ bases. Under laboratory conditions, Escherichia coli divides once every 30 minutes (Alberts, Johnson, Lewis, Raff, Roberts, & Walter, 2002d). If E. coli divided at this rate without natural selection, all bases of E. coli’s DNA would change within one million years. This is the power of the second law of thermodynamics. No matter how sophisticated the copy system is, miscopying cannot be prevented entirely. As a result, if copying is repeated many times, miscopying will occur and its effects will accumulate.
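The arithmetic behind this simplified example can be checked with a short sketch. The error rate, division time, and time span are taken from the text; the genome size of about 4.6 million bases is an assumption added here, as is the simplification that errors are independent and that repair and back-mutation are ignored.

```python
import math

ERROR_RATE_PER_BASE = 1e-9    # from the text: about one error per 10^9 bases copied
MINUTES_PER_DIVISION = 30     # from the text: laboratory division time
YEARS = 1_000_000             # from the text: one million years
GENOME_SIZE = 4_600_000       # assumption: approximate E. coli genome size in bases

# Number of successive divisions along a single lineage.
divisions = YEARS * 365 * 24 * 60 // MINUTES_PER_DIVISION

# Probability that one given base is copied correctly in every division,
# assuming independent errors and ignoring repair and back-mutation.
p_unchanged = math.exp(divisions * math.log1p(-ERROR_RATE_PER_BASE))

# Expected number of bases in the genome that have never been miscopied.
expected_unchanged_bases = GENOME_SIZE * p_unchanged

print(f"divisions: {divisions:.3e}")
print(f"P(base never miscopied): {p_unchanged:.2e}")
print(f"expected untouched bases: {expected_unchanged_bases:.2f}")
```

Under these assumptions a lineage passes through 17,520,000,000 divisions, so each base is expected to be miscopied about 17.5 times, and the expected number of bases that are never miscopied falls below one.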
4. Natural Selection Eliminates Fatal Mutation through Death
One of the major characteristics of life is that it proliferates. From a gene-centric perspective, proliferation leaves a large number of copies of each gene. However, no matter how accurately each copy is made, miscopies will accumulate if copying occurs repeatedly. Therefore, without natural selection, life cannot prevent the accumulation of miscopies.
As noted above, if E. coli proliferates without natural selection, genetic information will be lost within a million years. This is the power of the second law of thermodynamics. To record information, the system must always be biased away from equilibrium, and the system approaches equilibrium with each copy. When the copy accuracy is $r$ and the number of copies is $n$, Equation (1) is established:
$$\lim_{n \to \infty} r^n = 0 \quad (0 \le r < 1) \qquad (1)$$
As $n$ increases without limit, $r^n$ approaches 0. Since $r$ is always less than 1, information will eventually be lost.
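The decay of the copy accuracy with repeated copying can be made concrete with a minimal sketch. Here r is read as the fraction of the information that survives one copy, and the helper names and the example values of r are hypothetical choices for illustration.

```python
import math


def fidelity_after(r: float, n: int) -> float:
    """Fraction of the original information remaining after n copies."""
    return r ** n


def copies_until(r: float, threshold: float) -> int:
    """Smallest n with r**n <= threshold, for 0 < r < 1 and 0 < threshold < 1."""
    return math.ceil(math.log(threshold) / math.log(r))


for r in (0.9, 0.99, 0.999999):
    n_half = copies_until(r, 0.5)
    print(f"r = {r}: fidelity drops to one half after {n_half} copies")

# With perfect copying (r = 1) the fidelity never decays.
print(fidelity_after(1.0, 10**6))   # 1.0
```

For example, with r = 0.99 the fidelity falls below one half after 69 copies. Raising r only postpones the loss; for any r below 1 the product still tends to 0.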
The most important point is that the end of life is irreversible. Natural selection eliminates deleterious mutations using the death of living organisms. Even with infinite energy, we could not revive a dead cell. Natural selection irreversibly removes miscopies using death.
If there is a gene in which all mutations are lethal, the gene should not mutate at all. In fact, the amino acid sequence of histone H4 has hardly changed during eukaryotic evolution. The important bases of the small subunit of ribosomal RNA are conserved throughout the entire evolutionary period of life. These cases correspond to r = 1 in Equation (2). The information is conserved no matter how many times it is copied.
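The contrast between copying with and without selection can be illustrated by a toy simulation. This is only a sketch under strong assumptions that are not in the text: a 20-base gene, a per-base error rate of 0.01, two offspring per individual, a population capped at 50, and every mutation being lethal. All function names are hypothetical.

```python
import random

BASES = "ACGT"


def copy_with_errors(gene: str, mu: float, rng: random.Random) -> str:
    """Copy a gene, replacing each base with a different one with probability mu."""
    out = []
    for base in gene:
        if rng.random() < mu:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def drift_lineage(gene, mu, generations, rng):
    """No selection: follow one lineage and keep every copy, errors included."""
    current = gene
    for _ in range(generations):
        current = copy_with_errors(current, mu, rng)
    return current


def selected_population(gene, mu, generations, size, rng):
    """Selection: every mutation is assumed lethal, so mutant offspring die."""
    population = [gene] * size
    deaths = 0  # counts only offspring removed because they carry a mutation
    for _ in range(generations):
        offspring = [copy_with_errors(g, mu, rng) for g in population for _ in range(2)]
        survivors = [g for g in offspring if g == gene]
        deaths += len(offspring) - len(survivors)
        population = survivors[:size]  # limited resources cap the population
    return population, deaths


rng = random.Random(0)
original = "".join(rng.choice(BASES) for _ in range(20))
drifted = drift_lineage(original, 0.01, 500, rng)
survivors, deaths = selected_population(original, 0.01, 500, 50, rng)

print("identity without selection:", identity(original, drifted))
print("survivors identical to original:", all(g == original for g in survivors))
print("deaths caused by lethal mutations:", deaths)
```

In this toy model the surviving population carries the original sequence unchanged, which corresponds to the case r = 1, but only at the cost of many deaths; the unselected lineage drifts away from the original.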
Therefore, natural selection enables the preservation of genetic information despite the second law of thermodynamics. Life has preserved significant quantities of old information. Furthermore, natural selection has driven the evolution of life, and new information has been created.
It is important that both life and death are features of living organisms. As long as there is life, information will be retained and copied accurately. Inversely, when life ends, genetic information is irreversibly lost. In light of both life and death, natural selection is possible. The first condition of information preservation is that life aims to retain genetic information. The second condition of information preservation is that death irreversibly eliminates disadvantageous information.
There is a deep relationship between entropy and information, and there has been significant discussion about the differences between them. This problem has a long history and there have been many considerations since Maxwell first discussed it (Leff & Rex, 2002). Furthermore, Shannon’s entropy of information and Boltzmann’s entropy have isomorphic equations. Arieh Ben-Naim presented Jaynes’ maximum-uncertainty method, in which eliminating Boltzmann’s constant makes entropy, which is connected to missing information, dimensionless (Ben-Naim, 2008). He claimed that entropy can be substituted with missing information. Here, missing information indicates uncertainty.
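The formal similarity of the two entropies can be seen in a short sketch. It assumes the Gibbs form of the thermodynamic entropy, S = -k_B Σ p ln p, for a discrete probability distribution; this choice of form and the function names are assumptions made for illustration, not taken from the cited works.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def shannon_entropy_bits(probs):
    """Shannon entropy H = -sum(p * log2(p)), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def gibbs_entropy(probs):
    """Gibbs entropy S = -k_B * sum(p * ln(p)), in J/K."""
    return -K_B * sum(p * math.log(p) for p in probs if p > 0)


# Four equally likely states, for example the four bases at one position.
probs = [0.25, 0.25, 0.25, 0.25]

H = shannon_entropy_bits(probs)
S = gibbs_entropy(probs)

print(H)                          # 2.0
print(S / (K_B * math.log(2)))    # the same quantity, recovered from S
```

Dividing S by k_B ln 2 removes the physical units and returns the Shannon entropy in bits, which is the sense in which the two expressions differ only by a constant factor.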
When we accept Ben-Naim’s claim, life is viewed as maintaining order using stored information. Specifically, life is thought to have escaped equilibrium due to preserved information. For example, if a protein of a living organism breaks, the living organism will make the same protein using a blueprint contained in the DNA. In this manner, a living organism uses information to maintain order as long as it lives. By contrast, upon death, order in the body is lost. It becomes a substance and follows the second law of thermodynamics. Such characteristics of life form the basis of evolution. Given that individuals with advantageous genes tend to survive, advantageous genes will proliferate. On the other hand, disadvantageous genes are eliminated by death.
Next, consider the information that has been preserved by human civilization. Dawkins referred to replicators in human culture as memes (Dawkins, 2016). As a familiar example, consider a movie on DVD, which is a kind of meme. We put our favorite movies on DVDs, place the DVDs in a case, and store them in a safe place. We keep them away from strong electricity or magnetism. Thus, the owner understands the properties of the DVD and uses this knowledge to store the movie on it. Furthermore, the owner will copy the media content before it ages. However, if a DVD is full, the content will be disposed of to make room for new content. When the owner discards a movie on DVD, it becomes garbage and follows the second law of thermodynamics. As a result, the information on the DVD is irreversibly lost. This process resembles natural selection.
In this way, it has become clear that the preservation of information by human civilization resembles the preservation of information by life. Both compensate for the increase of missing information with information. Only life or human civilization uses information to maintain order. In life, disadvantageous information is eliminated by death. In human civilization, unnecessary information is discarded.
Finally, we shall consider the relationship between information and the second law of thermodynamics. Figure 1 shows an example of an irreversible process in an insulating container. The left side of Figure 1 shows the initial state: a heated stone and a block of ice are in an insulating container. Heat is transferred from the heated stone to the block of ice through the air. The right side of Figure 1 shows the final state: the ice has melted into water and the system has reached thermal equilibrium, so that the air, the water, and the stone are at the same temperature. This process is irreversible in an isolated system. However, it is not always irreversible in an open system. If the initial state is recorded in detail, we can restore the system using energy. Firstly, after thermal equilibrium is reached, we open the insulating container. Next, we pick up the stone and heat it. Then, we transfer the water to a suitable container and place it in a freezer. After waiting for a while, we again have the heated stone and the block of ice. Therefore, if we have information about the initial state and enough energy, we can restore the initial state from equilibrium. That is, we can keep order using energy and information in an open system.
Firstly, because life is an open system, it is consistent with the second law of thermodynamics that life uses information and energy to maintain the order of the living body. In terms of maintaining order using information and energy, there is no major thermodynamic difference between a living organism maintaining itself and our maintaining a car. We put gasoline in the car when it runs low, and if the car breaks down, we repair it. We use information and energy to maintain our cars.
Secondly, the preservation of information by natural selection is consistent with the second law of thermodynamics. The key feature of living organisms is that there are two phases: life and death. This dual nature enables natural selection. While an organism is alive, its genetic information is retained. In contrast, when it dies, its genetic information is lost. Because life increases exponentially, many copies of genes are made. Inevitably, competition for survival occurs. As a result, genes that are advantageous for survival are retained, and genes that are disadvantageous for survival are eliminated by the deaths of individuals.
Figure 1. An example of an irreversible process in an insulating container.
In conclusion, life is not thermodynamically special. However, a major research issue remains. Why does life try to maintain its order by using information and energy? I cannot answer this question.
 Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., & Walter, P. (2002a). The Diversity of Genomes and the Tree of Life. In Molecular Biology of the Cell (4th ed., pp. 13-28). New York: Garland Science.
 Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., & Walter, P. (2002b). The Maintenance of DNA Sequences. In Molecular Biology of the Cell (4th ed., pp. 235-238). New York: Garland Science.
 Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., & Walter, P. (2002d). The Maintenance of DNA Sequences. In Molecular Biology of the Cell (4th ed., pp. 365-373). New York: Garland Science.
 Ben-Naim, A. (2008). The Structure of the Foundations of Statistical Thermodynamics. In A Farewell to Entropy: Statistical Thermodynamics Based on Information (pp. 221-250). Singapore: World Scientific Publishing Co. Pte. Ltd.
 Campbell-Kelly, M., Aspray, W., Ensmenger, N., & Yost, J. R. (2018a). Inventing the Computer. In Computer: A History of the Information Machine (3rd ed., pp. 63-95). New York: Routledge.
 Campbell-Kelly, M., Aspray, W., Ensmenger, N., & Yost, J. R. (2018b). Broadening the Appeal. In Computer: A History of the Information Machine (3rd ed., pp. 251-273). New York: Routledge.
 Darwin, C. (2019). Struggle for Existence. In On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life (pp. 55-72). Seattle, WA: Amazon Classics.
 DeLange, R. J., Fambrough, D. M., Smith, E. L., & Bonner, J. (1969). Calf and Pea Histone IV. 3. Complete Amino Acid Sequence of Pea Seedling Histone IV; Comparison with the Homologous Calf Thymus Histone. Journal of Biological Chemistry, 244, 5669-5679.
 Leff, H. S., & Rex, A. F. (2002). Overview. In Maxwell’s Demon 2: Entropy, Classical and Quantum Information, Computing (pp. 1-39). Boca Raton, FL: CRC Press.