1.1. The Gorilla Lineage as the “Missing Link”
“During many years I collected notes on the origin or descent of man, without any intention of publishing on the subject, but rather with the determination not to publish as I thought that I should thus only add to the prejudices against my views.”―Charles Darwin, 1871
Genome sequencing has been advancing in line with the law of accelerating returns, with the total amount of sequence data produced doubling approximately every seven months. With the genetic revolution, phylogenetic relationships are no longer limited to morphological characters; they can instead be read like an open book. The fossil record, when combined with genomics, can reveal an evolutionary history that was unimaginable based on morphological analyses alone. This thesis explores a new chapter that shows how hominin evolution is not a single continuous lineage but instead the hybridization of two lineages, separated over millions of years, whose genomes recombined into the hybrid lineages Paranthropus and Australopithecus. That hybridization also accounts for the “missing link”: the hybridization of two lineages explains the absence of a single continuous lineage.
The protagonist of the thesis is a single pseudogene on chromosome 5, tentatively called “ps5”, that originates from the mitochondrial genome and belongs to a class of sequences with unique properties for tracing hybridization that would otherwise have been impossible to read [3 - 5]. This pseudogene alone provides definitive evidence that there was gene transfer between Gorilla, Pan and Homo at the time of the Pan-Homo split.
With clear evidence of introgression, the rest of the genetic trail of hybridization can be read with ease, standing on a strong foundation of indisputable proof.
In early screening of mitochondrial pseudogenes within the human genome, a pseudogene sequence on chromosome 5 was discovered, which later turned out to be a large (~9 kb) NUMT, tentatively called “ps5”. With advances in genome sequencing of Gorilla and Pan, the same ~9 kb pseudogene sequence was discovered at homologous chromosomal positions in both those lineages, while it was absent in Pongo.
The pseudogene, when compared to mitochondrial branches of Gorilla, Pan and Homo, is shown to have diverged between the three lineages not at the Gorilla/Pan-Homo split but rather at the Pan-Homo split (Figure 3), which is clear evidence that there was gene transfer between the three lineages at that time.
The ps5 pseudogene shares affinities with the gorilla lineage mtDNA, which suggests that it originated in the gorilla lineage. Because the probability of a NUMT insertion is unaffected by hybridization, it is clear that the insertion happened prior to the introgression event, and that the pseudogene had been evolving in the gorilla lineage for a period of time before introgressing into Pan and Homo.
With high availability of genetic data for both mitochondrial DNA and the pseudogene sequence, the exact history of ps5 can be read by comparing mutations within all three lineages.
The ratio of synonymous to non-synonymous mutations is a marker to distinguish between coding and non-coding gene sequences, because non-synonymous mutations are selected against until the gene is inactivated. For the “stem” of the ps5 pseudogene (the mutations that accumulated prior to its divergence into three lineages), the fraction of mutations that are of the coding (“mitochondrial”) type rather than the non-coding (“pseudogenic”) type is 3/4.
The mutation rate in the mitochondrial genome is significantly higher than in the nuclear genome, which means that the 25% of stem mutations that are pseudogenic needed proportionally longer to accumulate. With an estimated 10× higher mutation rate in mtDNA and 3× more “mitochondrial” mutations, the “pseudogenic” mutations took 10/3 ≈ 3.3× longer to accumulate, giving a rough estimate that the insertion happened 1.8 Myr after the Gorilla/Pan-Homo split, 4.2 Myr before the introgression event that led to the Pan-Homo split (Figure 3).
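The 3.3× figure above follows from a simple molecular-clock relation in which the number of accumulated mutations is proportional to mutation rate multiplied by time. The following is a minimal sketch of that arithmetic in Python, not the analysis pipeline behind the estimate; the function name and the strict proportionality assumption are illustrative, while the inputs (a 3/4 “mitochondrial” fraction and a 10× rate ratio) are the values stated in the text.

```python
def pseudogenic_to_mitochondrial_time_ratio(mito_fraction, rate_ratio):
    """Return t_pseudogenic / t_mitochondrial for the ps5 stem.

    Assumes a simple clock: mutations = rate * time in each phase.
    mito_fraction: share of stem mutations of the "mitochondrial" type.
    rate_ratio:    mtDNA mutation rate divided by nuclear mutation rate.
    """
    if not 0.0 < mito_fraction < 1.0:
        raise ValueError("mito_fraction must lie strictly between 0 and 1")
    if rate_ratio <= 0.0:
        raise ValueError("rate_ratio must be positive")
    pseudo_fraction = 1.0 - mito_fraction
    # n_pseudo / n_mito = (rate_nuc * t_pseudo) / (rate_mt * t_mito)
    # => t_pseudo / t_mito = (n_pseudo / n_mito) * (rate_mt / rate_nuc)
    return (pseudo_fraction / mito_fraction) * rate_ratio


# Values stated in the text: 3/4 "mitochondrial" mutations, 10x faster mtDNA.
ratio = pseudogenic_to_mitochondrial_time_ratio(0.75, 10.0)
print(round(ratio, 1))  # 3.3
```

The sketch reproduces only the ratio of the two time spans; converting it to absolute dates additionally requires an estimate of the interval between the Gorilla/Pan-Homo split and the Pan-Homo split.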
1.3. Insights into Hominin Evolution from the Gorilla Genome Project
The Gorilla Genome Project produced the first complete genome of Gorilla, from a female western lowland gorilla, and it revealed a closer relationship between humans and gorilla than morphological analyses had shown: in 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other. This pattern was at the time interpreted as incomplete lineage sorting. The ps5 NUMT, as definitive evidence of gene transfer between Gorilla, Pan and Homo around the time of the Pan-Homo split, shows that the lineage sorting is more parsimoniously explained as a result of introgression.
Introgression may lead to speciation, in which the new hybrid lineages become reproductively isolated from parental populations. Since Pan and Homo have diverged through lineage sorting, with 15% of the introgressed genes ending up in Pan and another 15% in Homo, it is reasonable to conclude that the introgression caused the Pan-Homo split (Figure 1), and therefore that it occurred at the time of the Pan-Homo split, around 6 million years ago.
1.4. Paranthropus: A Companion to Australopithecus
With conclusive evidence that introgression from Gorilla caused the Pan-Homo split, it can also be seen that Paranthropus and Australopithecus, as two separate lineages, both speciated as a result of introgression from the Gorilla lineage (Figure 2). The lineage sorting seen in Pan and Homo can be predicted for Paranthropus as well, with the gorilla-like features, such as strong muscles of mastication, being a result of lineage sorting following the introgression from Gorilla (Figure 1). These features were conserved because the browsing adaptations that are seen in Gorilla were co-opted for grazing, in convergent evolution with other species in the Afar region, such as Eurygnathohippus and Theropithecus, both grass-eating species descended from browsers.
1.5. The Burtele Foot (BRT-VP-2/73) and Au. deyiremeda, a Paranthropus?
The discovery of 3.2 - 3.5 million year old hominin fossils that show divergent evolution from the contemporaneous Au. afarensis [15, 16] suggested the classification of a new species, Australopithecus deyiremeda, meaning “close relative” in the local Afar language. The fossils feature an abductable great toe (Figure 4) instead of the human-like hallux of Au. afarensis; a human-like transverse arch that stiffens the foot, instead of the transitional arch of Au. afarensis that is intermediate between Homo and Pan; and jaws and teeth that share characteristics with Paranthropus and Homo.
The definitive proof that introgression caused the speciation of both Paranthropus and Australopithecus shows that Au. deyiremeda is better classified as a Paranthropus, P. deyiremeda, and that an early split between Paranthropus and Australopithecus, via the same lineage sorting that is seen in Pan and Homo, is the reason there were two separate lineages of hominins during the Pliocene [15 - 18], clearly distinguishable by their locomotor adaptation and diet.
Figure 1. Phylogenetic tree showing how introgression caused the speciation of humans. This introgression speciation model predicts an early split for Paranthropus and Australopithecus, increasingly shown in the fossil record [15 - 18], and also shows that the evolution of genes that ended up in Australopithecus, and therefore in extant humans, as well as in Paranthropus, can and should be traced along the gorilla lineage as well.
Figure 2. Introgression from Gorilla caused the speciation of both Australopithecus and Paranthropus, which means that traits that evolved independently in the gorilla lineage were transferred into the hybrid lineages. Paranthropus are often described as “gorilla-like”: they have sagittal crests, which suggest strong muscles of mastication, and broad, grinding herbivorous teeth that led to the name “nutcracker man” for Paranthropus boisei, who lived between 2.4 - 1.4 Ma.
Figure 3. Joint phylogenetic tree of hominine mtDNA and the ps5 pseudogene of mtDNA. Black and pink lines depict the mitochondrial and the pseudogene lineages respectively, diverging from their mitochondrial common ancestor. The insertion of mtDNA fragments into the nuclear genome of Gorilla can be roughly estimated to 1.8 Myr after the Gorilla/Pan-Homo split, and the transfer to Pan and Homo to the human-chimpanzee split, along with 30% of the Gorilla genome.
2.1. Pan-Homo Split via Gorilla Introgression
The lineage sorting of 30% of the gorilla genome that is seen in humans and chimpanzees is a result of introgression, an event that caused the speciation of Pan and Homo (Figure 1), and the two lineages diverged through lineage sorting with 15% of the introgressed genes ending up in Pan and another 15% in Homo.
Figure 4. The Burtele foot, BRT-VP-2/73, found in 2009 in Burtele at Woranso-Mille, Afar, tentatively assigned Au. deyiremeda, contemporaneous with Au. afarensis, shows distinct locomotor adaptation as it retains a grasping hallux, in contrast to the human-like adducted hallux that had developed in Australopithecus afarensis. The conclusive evidence that hominin evolution was caused by introgression from Gorilla suggests that Au. deyiremeda is better classified as Paranthropus deyiremeda. With a revised taxonomic classification, building on a combination of genomic data and fossil records, it can be predicted that Paranthropus and Australopithecus, like Pan and Homo, diverged through lineage sorting as the two lineages co-opted genes from the Gorilla lineage to adapt for separate niches.
2.2. Paranthropus and Australopithecus Were Hybrid Lineages
Traits within Paranthropus that resemble Gorilla, such as the sagittal crest (Figure 5), are more parsimoniously explained as a result of the introgression event than as convergent evolution, through lineage sorting similar to that displayed by 30% of the Gorilla genome in Pan and Homo. This supports the hypothesis of Paranthropus as a lineage that also speciated from the introgression (Figure 1).
2.3. The Taxonomic Classification of Paranthropus deyiremeda
The combination of genome sequencing data with the fossil record provides insight into how Paranthropus and Australopithecus are related, shows that both lineages speciated as a result of introgression from Gorilla, and provides a foundation for the taxonomic classification of Paranthropus deyiremeda.
The foot stiffness in Au. deyiremeda is not a preserved character; it is a derived character that is absent in the Au. afarensis lineage as well as in Pan and Gorilla. It exists together with an abducted great toe, and is contemporary with an adducted (human-like) hallux as a derived feature in Au. afarensis. These are substantial adaptive differences that had accumulated over significant time spans of divergent evolution, indisputable evidence that Au. deyiremeda is a separate lineage that had adapted for a separate niche, which is also what justified its original classification as a “close relative”. The dentognathic features that are similar to Paranthropus suggest similar dietary adaptations, and within the hypothesis of introgression as a cause of speciation, the most parsimonious explanation is lineage sorting from the introgression event, with adaptations for browsing such as large muscles of mastication that were co-opted for grazing.
Figure 5. Paranthropus aethiopicus, 2.8 - 2.3 Ma, with gorilla-like sagittal cranial crests as an attachment for strong muscles of mastication, a dietary adaptation. The genetic proof of an introgression event at the time of the Pan-Homo split shows that the most parsimonious origin for those features within Paranthropus was lineage sorting from the introgression event, originating in Gorilla, rather than convergent evolution. Image from the public domain (CC BY-SA 3.0).
The speciation of Hominini was caused by introgression from Gorilla, and Pan, Australopithecus and Paranthropus diverged as a result of lineage sorting (Figure 1), each branch adapting for a separate ecological niche. With a strong foundation built on the genome revolution, the fossil record reveals a clear picture of two separate lineages of hybrids, Australopithecus and Paranthropus, that co-existed throughout the Pliocene and Pleistocene and that retained separate traits from the hybridization event. The fossil evidence of Paranthropus deyiremeda shows that by the mid-Pliocene, the two lineages had developed separate derived features, both in foot morphology and mastication, which are later both found within Homo. In other words, Homo has integrated traits that were separated in the ancestral Paranthropus and Australopithecus lineages.
The indisputable evidence that introgression from Gorilla caused the speciation of Pan and Homo is made possible by the genome revolution and is centered on the mitochondrial pseudogene “ps5”. It provides a map, a reference frame that makes it possible to read the world in ways that were previously out of sight, and can serve as an important reference for continued research into hominin evolution. What remains to be understood is what environmental and ecological factors triggered the hybridization.
4. Materials & Methods
4.1. Lineage Sorting and the ps5 NUMT
Phylogenetic relationships can be read from genome comparison. Mitochondrial pseudogenes within the nuclear genome, which originate from mtDNA, provide an ideal marker for tracing hybridization events over large evolutionary time scales. The ps5 NUMT in Gorilla, Homo and Pan has preserved a record of an event in hominin evolution that, when combined with the fossil record and with genome analyses as a whole, shows a clear trail of introgression from Gorilla at the Pan-Homo split, and shows that this hybridization was what caused the speciation of hominins.
The ps5 homologs in Gorilla, Pan and Homo, when compared to the respective mitochondrial genomes, show that ps5 formed from mtDNA at a point after the Gorilla/Pan-Homo split and that it originated in the Gorilla lineage, with a rough estimate of insertion into the nuclear genome 1.8 Myr after the Gorilla/Pan-Homo split. It then evolved within the nuclear genome of Gorilla over 3.3× the time period during which it had accumulated “mitochondrial” mutations, before being transferred to the common ancestor of Pan and Homo during the hybridization event, of which ps5 is a small but important record.
With the exponential growth of genome sequencing, the amount of sequence data produced doubling approximately every seven months, there are full genome sequences for Homo, Pan and Gorilla [10, 19, 20], and the comparison of all three lineages showed, quite unexpectedly, that in 30% of the Gorilla genome, gorilla is closer to human or chimpanzee than the latter are to each other. In other words, there is a genomic record of lineage sorting between Pan and Homo for 30% of the Gorilla genome, with 15% ending up in Pan and another 15% in Homo.
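The 30% / 15% / 15% partition described above amounts to tallying, locus by locus, which pair of lineages groups together. The sketch below illustrates that bookkeeping only; the topology labels (`HC`, `HG`, `CG`), the function name, and the toy input are hypothetical, chosen to mirror the percentages stated in the text and not taken from any real dataset.

```python
from collections import Counter

# Hypothetical labels for which two lineages are closest at a locus:
# "HC" = Homo + Pan, "HG" = Homo + Gorilla, "CG" = Pan + Gorilla.
TOPOLOGIES = ("HC", "HG", "CG")


def topology_fractions(labels):
    """Return the fraction of loci supporting each of the three topologies."""
    counts = Counter(labels)
    unknown = set(counts) - set(TOPOLOGIES)
    if unknown:
        raise ValueError(f"unexpected topology labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no loci supplied")
    return {topology: counts[topology] / total for topology in TOPOLOGIES}


# Toy input (an assumption, not real data): 100 loci split 70 / 15 / 15.
toy_loci = ["HC"] * 70 + ["HG"] * 15 + ["CG"] * 15
fractions = topology_fractions(toy_loci)

# Share of loci where Gorilla is closer to Homo or to Pan than they are
# to each other.
gorilla_closer = fractions["HG"] + fractions["CG"]
print(fractions, gorilla_closer)
```

A tally of this kind describes the pattern itself; the interpretation of that pattern is the subject of the surrounding text.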
Given that there was gene transfer at the time of the Pan-Homo split, the lineage sorting is most parsimoniously explained as a result of introgression from Gorilla, in a hybridization event that also transferred the ps5 NUMT from Gorilla to the common ancestor of Pan and Homo, and that led to the Pan-Homo split as the two lineages diverged through lineage sorting of the introgressed genes (Figure 1).
4.2. The Fossil Record Combined with Genomics
With ps5 as definitive proof of gene transfer at the Pan-Homo split, and with the lineage sorting from Gorilla seen in Pan and Homo providing clear evidence of introgression, the combination of genome sequencing data with the fossil record can provide insight into how Paranthropus and Australopithecus are related.
The lineage sorting seen in Pan and Homo can be predicted for Paranthropus as well, with gorilla-like features, such as a sagittal crest from strong muscles of mastication, being a result of lineage sorting following the introgression from Gorilla (Figure 1), conserved because the browsing adaptations seen in Gorilla were co-opted for grazing.
Through the combination of genomics and the fossil record, a foundation for the taxonomic classification of Paranthropus deyiremeda can be constructed, with clear evidence of divergent morphological features in P. deyiremeda and Au. afarensis, which fits perfectly with lineage sorting between the two hybrid lineages.
The taxonomic classification of P. deyiremeda extends the fossil record of the Paranthropus lineage backwards in time to the mid-Pliocene, 3.5 Mya, and shows clearly that by the mid-Pliocene the hybrid lineages Australopithecus and Paranthropus had adapted to separate niches, each lineage conserving its own set of traits from the two parental lineages.
5. Author Summary
The exact nature of the original speciation event leading to the origin of the human and chimpanzee lineages is unknown. With advances in genome sequencing, and two decades of data on Homo, Pan, and Gorilla, there is now conclusive evidence that introgression from Gorilla caused the human-chimpanzee split, and that Homo and Pan diverged through lineage sorting with 15% of the introgressed genes ending up in Homo and another 15% in Pan. The definitive proof comes in the form of a NUMT (“nuclear mitochondrial DNA segment”) on chromosome 5, tentatively called “ps5”, that was transferred from Gorilla to Homo and Pan at the time of the Pan-Homo split. The reason “ps5” provides definitive proof is that mitochondrial pseudogenes like ps5 can be compared with the mitochondrial genome, making it possible to compare the time when ps5 diverged between Gorilla, Homo and Pan with the time when the mtDNA of Gorilla, Homo and Pan diverged.
Johan Nygren is an independent researcher and the author of the thesis and all content in it.
 Kurzweil, R. (2004) The Law of Accelerating Returns. In: Teuscher, C., Ed., Alan Turing: Life and Legacy of a Great Thinker, Springer, Berlin Heidelberg, 381-416.
 Stephens, Z.D., Lee, S.Y., Faghri, F., Campbell, R.H., Zhai, C., Efron, M.J., et al. (2015) Big Data: Astronomical or Genomical? PLOS Biology, 13, e1002195.
 Hazkani-Covo, E., Zeller, R.M. and Martin, W. (2010) Molecular Poltergeists: Mitochondrial DNA Copies (Numts) in Sequenced Nuclear Genomes. PLoS Genetics, 6, e1000834.
 Li-Sucholeiki, X.-C., Khrapko, K., André, P.C., Marcelino, L.A., Karger, B.L. and Thilly, W.G. (1999) Applications of Constant Denaturant Capillary Electrophoresis/High-Fidelity Polymerase Chain Reaction to Human Genetic Analysis. Electrophoresis, 20, 1224-1232.
 Popadin, K., Gunbin, K., Peshkin, L., Annis, S., Fleischmann, Z., Kraytsberg, G., et al. (2017) Mitochondrial Pseudogenes Suggest Repeated Inter-Species Hybridization in Hominid Evolution. Cold Spring Harbor Laboratory.
 Scally, A., Dutheil, J.Y., Hillier, L.W., Jordan, G.E., Goodhead, I., Herrero, J., et al. (2012) Insights into Hominid Evolution from the Gorilla Genome Sequence. Nature, 483, 169-175.
 Cerling, T.E., Mbua, E., Kirera, F.M., Manthi, F.K., Grine, F.E., Leakey, M.G., et al. (2011) Diet of Paranthropus boisei in the Early Pleistocene of East Africa. Proceedings of the National Academy of Sciences, 108, 9337-9341.
 Melcher, M., Wolf, D. and Bernor, R.L. (2013) The Evolution and Paleodiet of the Eurygnathohippus feibeli Lineage in Africa. Paläontologische Zeitschrift, 88, 99-110.
 Levin, N.E., Haile-Selassie, Y., Frost, S.R. and Saylor, B.Z. (2015) Dietary Change among Hominins and Cercopithecids in Ethiopia during the Early Pliocene. Proceedings of the National Academy of Sciences, 112, 12304-12309.
 Haile-Selassie, Y., Gibert, L., Melillo, S.M., Ryan, T.M., Alene, M., Deino, A., et al. (2015) New Species from Ethiopia Further Expands Middle Pliocene Hominin Diversity. Nature, 521, 483-488.
 Haile-Selassie, Y., Saylor, B.Z., Deino, A., Levin, N.E., Alene, M. and Latimer, B.M. (2012) A New Hominin Foot from Ethiopia Shows Multiple Pliocene Bipedal Adaptations. Nature, 483, 565-569.
 Haile-Selassie, Y., Melillo, S.M. and Su, D.F. (2016) The Pliocene Hominin Diversity Conundrum: Do More Fossils Mean Less Clarity? Proceedings of the National Academy of Sciences, 113, 6364-6371.