After the first and second laws of thermodynamics had been established, Maxwell raised the question of whether life is compatible with the second law, framing it in terms of a sorting demon. This question has been discussed by many physicists. For example, Szilard suggested that the demon actually transforms “information” into “negative entropy”. In 1949, Brillouin summarized the different opinions about life held by physicists, chemists and biologists in the following three groups. 1) Our present knowledge of physics and chemistry is practically complete, and these physical and chemical laws will soon enable us to explain life. 2) We know a great deal about physics and chemistry, but it is presumptuous to pretend that we know all about them. We admit that life obeys all the laws of physics and chemistry at present known to us, but we definitely feel that something more is needed before we can understand life. 3) The principles of thermodynamics, especially the second law, apply only to inert objects. Life is an exception to the second law, and a new principle of life will have to explain conditions contrary to the second law.
Following the suggestion by Szilard, Brillouin then analysed the operation of Maxwell’s demon, assuming that at least one quantum of light emitted from an electric torch is scattered by a molecule and absorbed in the eye of the demon. In this analysis, he showed that the increase in entropy of the demon on accepting the light quantum is greater than the decrease in entropy gained by acquiring the information about the molecule, and concluded that the demon ultimately dies through the repetition of such operations. Schrödinger pointed out that the organism acquires negative entropy by assimilating food, but he did not consider how the food restores the demon. Although thermodynamics was afterwards extended to treat irreversible processes far from equilibrium, this approach is directed at evaluating the entropy production arising from the dissipative structure, and does not inquire into how the organism maintains and further strengthens its lower entropy state. This is also the case for the hypercycle, proposed as a series of catalytic reactions. All of these approaches to life lack the consideration of self-reproduction. Self-reproduction has been discussed in terms of automata, but the problem of Maxwell’s demon is left untouched in this mathematical model.
Recently, studies in molecular biology have revealed the central dogma in the free-living organism: the proteins are translated from messenger RNAs (mRNAs) with the aid of ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs), and the three kinds of RNAs are all transcribed from the DNA genes, which are replicated upon self-reproduction. According to their specific amino acid sequences, these proteins play their respective roles in acquiring the material and energy sources from the outside and in converting them into the biomolecules for the growth and self-reproduction of the organism. To represent these molecular events in an organism, a new thermodynamic quantity called “biological activity” has been proposed previously. In the present paper, this quantity is first outlined again and then shown to be useful for explaining the maintenance and extension of negative entropy in organisms by evolution through self-reproduction.
2. The Thermodynamic Quantity of Biological Activity
A free-living organism is generally characterized by two internal variables: the size N of its genome (a set of genes) and the systematization SN of the genome and its products. The systematization is the degree of negative entropy, −SN, which should be measured for the arrangement of deoxynucleotide bases in genes, the arrangement of amino acid residues in the proteins as transmitted from the genes, the metabolic pathways formed by the catalytic reactions of enzyme proteins, the regulation and control of translation, transcription and replication of DNAs, the cell structure constructed by the intrinsic property of lipid molecules to form a vesicle of lipid bilayer and by the cell wall that supports the lipid vesicle, and, in the case of a multicellular organism, the communication between differentiated cells.
The energy acquired by such an organism depends on its genome size N and systematization SN as well as on the material and energy source M available from the environment. Thus, the acquired energy is expressed as Ea(M; N, SN), which is an increasing function of N and SN as well as M. On the other hand, the organism utilizes the material and energy to produce the biomolecules for its growth and self-reproduction. The energy Es(N, SN) stored in biomolecules such as polynucleotides, proteins, lipids and cell wall is another increasing function of N and SN. These biomolecules, which exhibit biological functions, have higher energy than the inorganic molecules into which they are finally decomposed, and the energy stored in these biomolecules should be measured relative to the energy of the decomposed state. The difference between the acquired energy and the stored energy, Ea(M; N, SN) − Es(N, SN), is released as heat. If the entropy production by the heat compensates for the entropy reduction −SN due to the systematization, this is consistent with the second law of thermodynamics. In the organism, the acquired energy is transiently trapped in ATP and NADPH molecules as chemical energy, and it is gradually consumed in the syntheses of other biomolecules under the guidance of enzyme proteins without drastic change in temperature T. Thus, the following quantity is proposed as the biological activity BA.
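The energy bookkeeping of this paragraph can be sketched numerically. The snippet below is only an illustration under a stated assumption: it takes the biological activity to have the form BA = Ea − Es − T·SN, which is what the second-law argument above suggests (the heat Ea − Es released at temperature T must produce at least as much entropy as the systematization SN removes). The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def biological_activity(e_acquired, e_stored, temperature, systematization):
    """Assumed form BA = Ea - Es - T*SN, in consistent energy units.

    e_acquired      -- Ea, energy acquired from the source M
    e_stored        -- Es, energy stored in biomolecules
    temperature     -- T, (nearly constant) temperature of the organism
    systematization -- SN, the amount of negative entropy
    """
    heat_released = e_acquired - e_stored
    return heat_released - temperature * systematization


def second_law_satisfied(e_acquired, e_stored, temperature, systematization):
    """True when entropy produced by the released heat covers the reduction SN."""
    entropy_produced = (e_acquired - e_stored) / temperature
    return entropy_produced >= systematization


# Hypothetical numbers chosen only to make the arithmetic easy to follow.
ba = biological_activity(200.0, 100.0, 300.0, 0.25)
ok = second_law_satisfied(200.0, 100.0, 300.0, 0.25)
```

Under this assumed form, a positive BA and compliance with the second law are the same condition, since dividing BA by T gives the net entropy balance.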
The positive value of this quantity has been illustrated for simple organisms and also estimated for higher organisms, except for the developmental stage of the latter. In multicellular eukaryotes such as animals and seed plants, the parent endows the egg or seed with the material and energy source, and the endowed energy is used for development and cell differentiation until the cooperative action of differentiated cells begins to acquire energy and materials from the outside. Thus, BA retains a positive value in such higher organisms throughout their lifetime if the acquired energy Ea(M; N, SN) includes the endowed energy. A larger value of BA means that the biological process tends to proceed more smoothly, and this quantity is considered, as a first approximation, to be proportional to the self-reproducing rate of an organism. The biological activity has the thermodynamic connotation of a departure from equilibrium, but it stands in a reverse relation to the well-known “free energy”, which decreases upon any change in a given system through a decrease in internal energy and/or an increase in entropy. This is due to the property of the genome, which is almost constant during the lifetime of an organism but changes enough over longer times to increase the biological activity by evolution, as will be shown in the next section.
3. Evolution of Unicellular Organisms
To illustrate simply that organisms maintain and further extend their negative entropy, we consider the behaviour of unicellular organisms that self-reproduce, sometimes mutate, and die. The set of internal variables (N, SN) of the organism will be simply denoted by a single variable x, unless the description of changes in the genome is necessary. In the population of such organisms taking a common material and energy source M from the outside, the number nxi(t) of i-th variants xi changes with time t according to the following equation of replicator dynamics.
Here, the self-reproducing rate and death rate of the variant xi are denoted by R(M; xi) and D(xi), respectively. The apparent decrease factor Qxi(t) for the self-reproducing rate of variant xi represents the mutation of variant xi to other kinds of variants and is related to the mutation term qxj,xi(t) in the following way.
The population behaviour becomes transparent by transforming Equation (2) into two types of equations; one concerning the total number B(t) of all kinds of variants defined by
and another concerning the fraction fxi(t) of variants xi defined by nxi(t)/B(t). By simple calculation, these equations are obtained from Equation (2) to the following forms, respectively, using the relation (3).
Here, qxi,xi(t) is defined by Qxi(t) − 1. The increase rate W(M; xi) of the variant xi is defined by
and the average increase rate of organisms in the population is defined by
If the mutation term represents a point mutation such as a nucleotide base change in a gene, it only changes SN in a genome of definite size, and the time change in the fraction can be evaluated in the following way to the first order of the mutation term. When the increase rate of an occasionally arisen mutant xi is larger than the average increase rate of the population, that is, W(M; xi) − Wav(M; t) > 0, the fraction fxi(t) of the mutant xi increases with time according to the first term on the right side of Equation (6). This raises the average increase rate Wav(M; t), resulting in an increase in the total number B(t) of organisms according to Equation (5), although this increase in B(t) is ultimately stopped by the decrease in the available material and energy source M. On the other hand, the fraction fxi(t) decreases when W(M; xi) − Wav(M; t) < 0. Thus, the organisms taking a common material and energy source M are elaborated by mutation and the above selection, and most of them reach the form xopt with the optimum increase rate.
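The selection argument above can be made concrete with a small numerical sketch. It assumes that the fraction equation has the usual replicator-mutator form, df_i/dt = (W_i − Wav) f_i + Σ_j q_ij R_j f_j with q_ii = Q_i − 1 and W_i = R_i − D_i; this form, the three hypothetical variants, and all rate values are assumptions for illustration, and the source M is held fixed so that R does not change with time.

```python
def step_fractions(f, R, D, q, dt):
    """One explicit Euler step of the assumed fraction equation.

    q[i][j] is the (assumed) mutation probability from variant j to variant i
    per reproduction; the diagonal is not used because the loss term is
    computed from the off-diagonal entries.
    """
    n = len(f)
    W = [R[i] - D[i] for i in range(n)]            # increase rates
    w_av = sum(W[i] * f[i] for i in range(n))      # average increase rate
    new_f = []
    for i in range(n):
        selection = (W[i] - w_av) * f[i]
        gain = sum(q[i][j] * R[j] * f[j] for j in range(n) if j != i)
        loss = sum(q[j][i] for j in range(n) if j != i) * R[i] * f[i]
        new_f.append(f[i] + dt * (selection + gain - loss))
    return new_f, w_av


def simulate(f0, R, D, q, dt=0.01, steps=5000):
    """Integrate the fractions and record the average increase rate."""
    f = list(f0)
    history = []
    for _ in range(steps):
        f, w_av = step_fractions(f, R, D, q, dt)
        history.append(w_av)
    return f, history


# Hypothetical population: variant 2 has the highest increase rate.
R = [1.0, 1.2, 1.5]
D = [0.5, 0.5, 0.5]
mu = 1e-4
q = [[0.0 if i == j else mu for j in range(3)] for i in range(3)]
final_f, w_history = simulate([0.98, 0.01, 0.01], R, D, q)
```

In this sketch the initially rare variant with the largest W takes over the population, and the recorded average increase rate rises toward that variant's value, which is the behaviour described for xopt.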
This selection process is the Darwinian evolution of unicellular organisms. This evolution was first proposed qualitatively to explain the generation of new species, based on the observation of unique species in geographically isolated regions and of domestic animals and plants. After the discovery of mutants and the rediscovery of Mendelian heredity, Darwinian evolution was mathematically formulated in population genetics to estimate the probability that a spontaneously arisen mutant is fixed in the population according to its selective parameter value. Although these studies of multicellular organisms have hardly influenced the physical and chemical approaches to life, the present derivation of Darwinian evolution indicates that this evolution drives the nucleotide bases in genes to converge on the special arrangement exhibiting the optimum increase rate, and that it is the essential process for maintaining the negative entropy of an organism. While RNAs and proteins are continuously transcribed and translated, respectively, to make up for their destruction, the DNA genome also suffers damage, especially in the nucleotide bases. However, such damaged bases are repaired and proofread. Under this molecular mechanism, nucleotide bases are substituted at a rate of u ~ 10⁻⁹ per site per year in both prokaryotes and eukaryotes. The prokaryote carries a genome whose size s is of the order of 10⁶ base pairs (bp), so it suffers base changes only with a probability of sut = 10⁻³ after t = one year. During this period, the prokaryote undergoes self-reproduction many times. Thus, most of the descendants retain the original genome and only a small fraction of descendants receive changed bases. Among the descendants receiving changed bases, some show a higher increase rate while the others are defective or selectively neutral, and the ratio of the former to the latter decreases in the mature population.
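The estimate sut = 10⁻³ quoted above is a one-line calculation; the sketch below reproduces it and, under the added assumption that substitutions at different sites are independent, also gives the fraction of lineages whose genome is unchanged after time t. The function names are hypothetical.

```python
import math


def expected_substitutions(genome_size_bp, rate_per_site_per_year, years):
    """Expected number of base substitutions per genome: s * u * t."""
    return genome_size_bp * rate_per_site_per_year * years


def fraction_unchanged(genome_size_bp, rate_per_site_per_year, years):
    """Probability that no site has changed, assuming independent sites.

    Uses log1p for numerical accuracy when u*t is very small.
    """
    per_site = rate_per_site_per_year * years
    return math.exp(genome_size_bp * math.log1p(-per_site))


# Values quoted in the text: s ~ 1e6 bp, u ~ 1e-9 per site per year, t = 1 year.
sut = expected_substitutions(1e6, 1e-9, 1.0)
unchanged = fraction_unchanged(1e6, 1e-9, 1.0)
```

With these values, roughly 99.9% of the lineages keep the original genome after one year, in line with the statement that most descendants retain it.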
Which arrangement of nucleotide bases in genes is the best depends on the environment, and new species are generated in Darwinian evolution by geographical isolation and/or climate change. This is essentially the same for the multicellular diploid eukaryotes discussed in the last section, although the reproduction of these eukaryotes through hybridization makes the evolutionary process somewhat more complicated.
The gene and genome sequencing that started in the latter half of the last century has brought new information about the evolution of organisms. The amino acid sequence similarities of paralogous proteins strongly suggest that the repertoire of protein functions has been expanded by gene duplication and by the succeeding changes in the counterpart of the duplicated genes due to nucleotide base changes, partial deletion and/or insertion, and domain shuffling. For the theoretical formulation of this evolution by gene duplication, the concept of biological activity is especially useful, and this evolution is also derived from Equation (6) as far as unicellular organisms are concerned. Gene duplication enlarges the genome size to N + ΔN. This increases the stored energy to Es(N + ΔN, SN+ΔN), while the value of the acquired energy Ea(M; N + ΔN, SN+ΔN) is almost the same as that of Ea(M; N, SN). Thus, the biological activity of the variant that has experienced gene duplication is at first lower than that of the original style organism, especially when the counterpart of the duplicated genes loses the original function through changes in its nucleotide bases. However, such variants are not necessarily driven to extinction but can exist as minor members in the population. Moreover, new function(s) can arise from such a changing gene. If the product(s) of such new gene(s) become suitable for the variant to acquire a new material and energy source L and/or the ability to move to a new area, the acquired energy turns to increase, overcoming the increase in systematization from SN to SN+ΔN as well as the increased stored energy Es(N + ΔN, SN+ΔN). Thus, a new style organism appears with recovered biological activity.
For formulating mathematically the above evolutionary route, the fraction of variants with the lower increase rate has to be considered more accurately than in Darwinian evolution, after the organisms xopt become dominant in the population, because the gene duplication occurs less frequently than the nucleotide base change. For this purpose, Equation (6) will be formally integrated with respect to time t, i.e.,
Among the fractions fxi(t) in this expression, we focus on the fraction fxi1(t) of the variants xi1, which have arisen from the dominant organism xopt by the duplication of one kind of gene. Then, the average increase rates Wav(M; τ) and Wav(M; τ’) are approximated by W(M; xopt), and the fractions of variants other than xopt are neglected on the right side of Equation (9). The mutation term qxi1,xopt(τ) is averaged over a sufficiently long time to be regarded as the rate of gene duplication;
This quantity qxi1,xopt is used as the occurrence rate of gene duplication. Even in this large time scale, the fraction f(xi1) of variants xi1 is present with the following relation to the fraction f(xopt) of dominant organisms xopt as a semi-stationary state.
The fraction f(xi2) of the variants xi2, which have further experienced the second kind of gene duplication, is also obtained from Equation (9). By focusing on the mutation term qxi2,xi1(τ) of fxi1(τ) on the right side of this equation, the fraction f(xi2) of variants xi2 is shown to be related with the fraction f(xi1) in the following way.
By repeating the similar procedure, the fraction f(xin) of variants xin, which have experienced n kinds of gene duplication, is obtained as
A new style of organisms y can appear from such a minor member xin by changing the counterparts of n kinds of duplicated genes into new genes. When this change probability is denoted by , the probability that a new style organism y appears from the original style organism xo is expressed as
Here, xopt in Equations (11-1) - (11-n) is replaced by xo, where the suffix o stands for original.
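The semi-stationary relations used in this derivation can be illustrated with a short sketch. It rests on an assumption about their form: when the average increase rate is approximated by W(M; xopt), a variant class produced at rate q·R from its parent class and growing more slowly than xopt settles at a fraction q·R/(W(xopt) − W(variant)) of the parent fraction, and n successive duplications multiply n such factors. The function names, the two-variant test system, and the numbers are hypothetical.

```python
def semi_stationary_ratio(q, r_parent, w_opt, w_variant):
    """Assumed mutation-selection balance: f(variant) / f(parent)."""
    return q * r_parent / (w_opt - w_variant)


def chained_fraction(f_opt, qs, r_parents, w_opt, w_variants):
    """Fraction of variants after len(qs) successive gene duplications."""
    f = f_opt
    for q, r, w in zip(qs, r_parents, w_variants):
        f *= semi_stationary_ratio(q, r, w_opt, w)
    return f


def integrate_two_variants(q, r_opt, w_opt, w_var, dt=0.01, steps=10000):
    """Euler integration of a two-class system (xopt and one variant xi1).

    Returns f(xi1)/f(xopt) at the end, to compare with the analytic ratio.
    Back mutation from the variant to xopt is neglected.
    """
    f_opt, f_var = 1.0, 0.0
    for _ in range(steps):
        w_av = w_opt * f_opt + w_var * f_var
        d_opt = (w_opt - w_av) * f_opt - q * r_opt * f_opt
        d_var = (w_var - w_av) * f_var + q * r_opt * f_opt
        f_opt += dt * d_opt
        f_var += dt * d_var
    return f_var / f_opt


analytic = semi_stationary_ratio(1e-4, 1.0, 0.5, 0.3)
numeric = integrate_two_variants(1e-4, 1.0, 0.5, 0.3)
two_step = chained_fraction(1.0, [1e-4, 1e-4], [1.0, 1.0], 0.5, [0.3, 0.3])
```

In this toy system the integrated ratio agrees with the analytic one to well within one percent, and each further duplication step lowers the fraction by another small factor, which is why variants with many duplications remain minor members of the population.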
If the new style organisms y are elaborated by Darwinian evolution to utilize the material and energy source M more efficiently, they drive the original style organisms x to extinction. If, on the contrary, the new style organisms y utilize a new material and energy source L other than M, they form a new population, in which the fraction fyk(t) of variant yk obeys the following equation, apart from the population of original style organisms xi in Equation (6).
Here, the average increase rate of new and original styles of organisms is defined by
and the total number B(t) of new and original style organisms obeys
This divergence of new and original styles of organisms can occur without geographical isolation and/or climate change. Generally, the material and energy sources L and M are not completely independent of each other but are connected through the circular flow of materials on the global scale. This provides the basis for forming an ecological system, which is also considered to be a systematization of organisms and environment.
The evolution by gene duplication has been investigated systematically at the molecular level for the O₂-releasing photosynthesis and O₂-respiration in eubacteria, which produce the circular flow of oxygen molecules. In addition to the amino acid sequence similarities of the proteins constituting these systems to ubiquitous proteins, the intermediate stages of eubacteria on the way to photosynthesis and O₂-respiration have also been detected. This suggests that the evolution by gene duplication in prokaryotes has taken place in a relatively stepwise manner, i.e., the suffix n of the probability (12) is a small number and the process from Equation (6) to Equation (13) has been repeated to accomplish the O₂-releasing photosynthesis and O₂-respiration. The above mathematical formulation also holds for the evolution of chemical syntheses in other prokaryotes and for the evolution of unicellular eukaryotes. The evolution of organisms before and after the unicellular organisms in the DNA-RNA-protein world will be discussed in the next section.
4. Conclusions and Discussion
In an organism, the entropy production due to the heat released from the difference between acquired energy and stored energy compensates for the negative entropy, which is organized so as to reproduce itself through the special arrangement of nucleotide bases in DNA genes. This is realized by the difference in stability and function between DNAs, RNAs and proteins. These molecular events in such an organism are represented by a thermodynamic quantity, the biological activity. This quantity reflects the genome change in an organism more directly than the “characters” such as shape, colour and size customarily used in evolutionary biology, and the change in this quantity is a better measure for formulating any type of evolutionary process of organisms. The negative entropy in an organism is maintained through the selection of self-reproduced organisms. Moreover, the organisms have the potential to extend the range of negative entropy, even if their increase rate is transiently lowered. This is illustrated for unicellular organisms in the present paper, using the concept of biological activity and the equation of replicator dynamics containing the mutation terms. This concept and the equation are specific to the organism, and they correspond to an answer to opinion 2) among the three opinions summarized by Brillouin.
The present mathematical formulation of self-reproducing unicellular organisms may also explain the origin of genes. Although the RNA replicase has been proposed as the start of life, various organic compounds including nucleotides and amino acids would have been synthesized non-biologically when the RNA replicase appeared. Some satellite variants of RNA replicases would have catalyzed the polymerization of amino acids as primitive rRNA and tRNA. Even if only several amino acids are polymerized at this stage, the random sequences of amino acid residues amount to enormously many kinds of polypeptides. When five amino acids drawn from the 20 kinds are randomly polymerized, for example, 20⁵ (3.2 × 10⁶) kinds of polypeptide chains are yielded, and they sufficiently cover the active centres of enzyme proteins in a present-day free-living organism. It is therefore plausible that, by the catalytic reactions of these polypeptides, primitive cells are formed and self-reproduce in the following way, as inferred from the relation between lipid synthesis and cell wall construction in the prokaryote. 1) First, lipid synthesis occurs and intrinsically forms lipid vesicles, each containing the polynucleotides and polypeptides needed to synthesize lipids and cell wall elements. 2) The cell wall elements synthesized within the vesicle are then transported to the outside across the lipid bilayer and form the cell wall network by the catalytic activity of the polypeptides on the outer surface of the vesicle. 3) When the lipid synthesis progresses faster than the enlargement of the cell wall, the density of the lipid bilayer gradually becomes higher and prevents the newly synthesized cell wall elements from passing through the lipid bilayer of the vesicle, resulting in an increase in the concentration of lipids and cell wall elements within the vesicle. 4) Thus, the division of the vesicle occurs spontaneously to lower the free energy by increasing the ratio of surface to volume. The density of the lipid bilayer is lowered in each of the divided vesicles, and this again makes it possible for the cell wall elements to be transported to the outside. 5) The cell wall network is newly constructed, especially in the interspace between the divided vesicles, and is connected with the old area of the network distant from the interspace. The old area of cell wall outside the newly constructed cell wall is finally broken, separating the divided vesicles each covered with cell wall, because this area is far from the polypeptides that catalyse the connection of cell wall elements on the respective surfaces of the divided vesicles.
Such self-reproducing proto-cells also obey the equation of replicator dynamics with the following mutation terms. If the first formed proto-cells are specified by xi according to their RNA contents, these RNA contents would have changed upon their replication by the primitive RNA polymerase as well as the RNA replicase activity. Thus, most of the proto-cells are elaborated to xopt by Darwinian evolution, especially with respect to the primitive rRNA and tRNAs that catalyze the polymerization of amino acids. However, this evolution increases the concentration of primitive RNA polymerases and thus increases the concentration of non-functional RNAs in the cell. This yields the variant proto-cells xi1, in which the increased non-functional RNAs interfere with the primitive tRNAs and rRNAs through the hydrogen bonds transiently formed between their nucleotide bases. Such variant cells decline to minor members in the population but produce the translation apparatus after the following steps of variation. A part (anticodon) of tRNA originally embracing the side chain of an amino acid residue alternatively attaches to the complementary bases in non-functional RNAs, which later become the codons of RNA genes of proteins. Such an ancestral RNA gene then comes into contact with the other type of RNAs, which later become the initiation complex. The primitive rRNA is enlarged to form the sites for the successive acceptance of amino acid residues carried by tRNAs aligned on ancestral genes of proteins. After the above series of variant cells, xi2, xi3, …, xin, a new style of self-reproducing cells, in which L-type amino acids are polymerized according to the sequence of codons in ancestral RNA genes, appears with a probability such as that of Equation (12). In this new style of cells, the polymerization rate of amino acids is raised in comparison with the random collision of charged tRNA and primitive rRNA, and the substituted bases in the ancestral RNA genes are selected by Darwinian evolution so as to encode the proteins that raise the increase rate of the cell, driving the original proto-cells xo to extinction.
The deoxidization of RNAs would then have occurred in some of the new style cells that began the glycolysis releasing protons. The DNAs thus generated would at first have been useless and would have lowered the increase rate of such variant cells. If the optimum RNA content of the self-reproducing cell with the translation apparatus is rewritten as xopt, the internal variable of the variant cell suffering deoxidization corresponds to xi1. To utilize the DNAs as genes, the variant cells xi1 must have further experienced the succeeding steps of variation xi2, xi3, …, xin to induce the genes of proteins for the transcription and replication of DNAs as well as of auxiliary proteins for the unwinding of double-stranded DNAs. Such induction of DNA-associated genes from the RNA-associated genes is suggested by the fact that DNA-dependent DNA polymerases and RNA polymerases form a protein superfamily together with the RNA-dependent RNA polymerases in RNA viruses, and that the DNA helicases show similarities to RNA helicases within a superfamily. After the parallel usage of RNA genes and DNA genes, the decisive turning point for the variant cells xin to enter the DNA-RNA-protein world would have been the direct or indirect attachment of DNAs under replication to the cytoplasmic membrane in cooperation with the cell division. This mechanism not only makes the cell division mechanical but also makes it possible for the replicated DNAs to be equi-partitioned into daughter cells. In the case of direct attachment to the membrane, the DNA genes are gradually fused into a single circular molecule, leading to the appearance of the ancestral prokaryote. On the other hand, DNA genes in the ancestral eukaryote are gathered into multiple chromosomes, and each chromosome under replication is attached to the membrane through contractile microtubules.
In this innovation, the probability in Equation (12) is divided into and where yp and ye denote the prokaryotic and eukaryotic styles of DNA genomes, respectively. Accordingly, Equation (13) of fraction fyk(t) is also divided into the equation of fraction fypk(t) and that of fraction fyek(t).
In fact, the analyses of neutral base changes in rRNAs reveal that all free-living organisms are traced back to the ancient divergence of prokaryote and eukaryote, which probably occurred 4 × 10⁹ years ago. Of the two, the prokaryote diverged first to evolve chemical syntheses, photosynthesis and O₂-respiration. During this period, the ancestral eukaryote probably lived as the predator of prokaryotes, evolving the nuclear membrane, endocytosis and exocytosis. Such a living style and cell structure made it possible for the eukaryote to acquire the mitochondria as the endosymbionts of O₂-respiratory eubacteria around 1.8 × 10⁹ years ago, and further to acquire photosynthetic plastids as the endosymbionts of cyanobacteria. Some of the eukaryotes having acquired endosymbionts then evolved multicellularity and cell differentiation, especially in green plants and animals, whose divergence is estimated to have occurred around 1.2 × 10⁹ years ago. These multicellular organisms have further evolved to take the diploid state through the intermediate stages of alternating the monoploid generation with the diploid one. This is due to the following situation. Although the cooperative action of differentiated cells is a powerful strategy to raise the acquired energy and to spread the living area of the organisms, the genome of each cell in the multicellular organism is expanded to 10⁸ bp or more, e.g., ~10⁸ bp in Arabidopsis and Drosophila, 3 × 10⁹ bp in Homo sapiens and ~10¹⁰ bp in Taxodials, compared with 1.2 × 10⁷ bp in the unicellular Saccharomyces. Nevertheless, the lifetime of differentiated cells has to be prolonged in the face of nucleotide base changes if they are to exhibit the cooperative action effectively. A diploid cell is seriously affected only when base substitutions occur concurrently at homologous sites.
Thus, the diploid eukaryote consisting of N cells retains the following number of cells not suffering base substitutions at any pair of homologous sites among the s sites of their genome during time t: N(1 − u²t²)^s. For example, this number is calculated to be N(1 − 10⁻⁵) even over one hundred years for the genome size s = 10⁹ bp, using u = 10⁻⁹ per site per year. The lifetimes of these multicellular diploid eukaryotes seem to be secondarily regulated within the duration estimated from the genome size, because they differ considerably even between closely related species. If the base substitution rate had been lower, cell differentiation would have evolved further in the monoploid state. However, increasing the number of proofreading rounds to lower the substitution rate requires much more energy than taking the diploid state. The multicellular diploid eukaryotes produce their children by the combination of egg and sperm (fertilization). This way of reproduction not only decreases the fraction of children receiving defective nucleotide bases homologously but also yields descendants receiving various combinational sets of new genes generated by gene duplication in different lineages of parents, causing explosive divergence on the way to establishing the respective new genes. This explains the punctuated mode of explosive divergence of body plans in animals, which was first pointed out by paleontology and then ascertained by the study of molecular evolution to have occurred during the period of 12–4 × 10⁸ years ago.
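The numerical example in this paragraph can be checked with a short sketch. It assumes that substitutions occur independently at each site with probability u·t, and that a cell is counted as affected only when all homologous copies of some site are hit (two copies in the diploid case); under these assumptions the intact fraction is (1 − (ut)^ploidy)^s. The function name and the `ploidy` parameter are hypothetical and serve only to contrast the diploid and monoploid cases.

```python
import math


def intact_fraction(genome_size_bp, rate_per_site_per_year, years, ploidy=2):
    """Fraction of cells with no site substituted on all homologous copies.

    Assumes independent sites; log1p keeps the result accurate when the
    per-site probability is far below floating-point resolution near 1.
    """
    p_site = (rate_per_site_per_year * years) ** ploidy
    return math.exp(genome_size_bp * math.log1p(-p_site))


# Values quoted in the text: s = 1e9 bp, u = 1e-9 per site per year, t = 100 years.
diploid = intact_fraction(1e9, 1e-9, 100.0, ploidy=2)
monoploid = intact_fraction(1e9, 1e-9, 100.0, ploidy=1)
```

For the quoted values the diploid fraction is very close to 1 − 10⁻⁵, whereas the monoploid fraction under the same assumptions is practically zero, which illustrates why a large genome favours the diploid state.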
While the organisms in the DNA-RNA-protein world have evolved by overcoming the decrease in non-biologically synthesized organic compounds, the organisms in the RNA-protein world would have turned to utilizing the translation apparatus as well as the nucleotides and amino acids of the prokaryotes and eukaryotes, and survive as RNA phages and viruses under the common codon usage. In this connection, it should be noted that the biological activity is degenerate: almost the same strength of biological activity can be attained either by a small genome, low systematization and a small amount of acquired energy, or by a large genome, high systematization and a large amount of acquired energy.