Integrative modeling of gene and genome evolution roots the archaeal tree of life
Tom A. Williams a,b,1, Gergely J. Szöllősi c,2, Anja Spang d,2, Peter G. Foster e, Sarah E. Heaps b,f, Bastien Boussau g, Thijs J. G. Ettema d, and T. Martin Embley b
Author Affiliations
a School of Earth Sciences, University of Bristol, Bristol BS8 1TQ, United Kingdom;
b Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom;
c MTA-ELTE Lendület Evolutionary Genomics Research Group, 1117 Budapest, Hungary;
d Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, SE-75123 Uppsala, Sweden;
e Department of Life Sciences, Natural History Museum, London SW7 5BD, United Kingdom;
f School of Mathematics & Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom;
g Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR5558, F-69622 Villeurbanne, France
Edited by W. Ford Doolittle, Dalhousie University, Halifax, Canada, and approved April 24, 2017 (received for review November 7, 2016)
Significance
The Archaea represent a primary domain of cellular life, play major roles in modern-day biogeochemical cycles, and are central to debates about the origin of eukaryotic cells. However, understanding their origins and evolutionary history is challenging because of the immense time spans involved. Here we apply a new approach that harnesses the information in patterns of gene family evolution to find the root of the archaeal tree and to resolve the metabolism of the earliest archaeal cells. Our approach robustly distinguishes between published rooting hypotheses, suggests that the first Archaea were anaerobes that may have fixed carbon via the Wood–Ljungdahl pathway, and quantifies the cumulative impact of horizontal transfer on archaeal genome evolution.
Abstract
A root for the archaeal tree is essential for reconstructing the metabolism and ecology of early cells and for testing hypotheses that propose that the eukaryotic nuclear lineage originated from within the Archaea; however, published studies based on outgroup rooting disagree regarding the position of the archaeal root. Here we constructed a consensus unrooted archaeal topology using protein concatenation and a multigene supertree method based on 3,242 single gene trees, and then rooted this tree using a recently developed model of genome evolution. This model uses evidence from gene duplications, horizontal transfers, and gene losses contained in 31,236 archaeal gene families to identify the most likely root for the tree. Our analyses support the monophyly of DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, Nanohaloarchaea), a recently discovered cosmopolitan and genetically diverse lineage, and, in contrast to previous work, place the tree root between DPANN and all other Archaea. The sister group to DPANN comprises the Euryarchaeota and the TACK Archaea, including Lokiarchaeum, which our analyses suggest are monophyletic sister lineages. Metabolic reconstructions on the rooted tree suggest that early Archaea were anaerobes that may have had the ability to reduce CO2 to acetate via the Wood–Ljungdahl pathway. In contrast to proposals suggesting that genome reduction has been the predominant mode of archaeal evolution, our analyses infer a relatively small-genomed archaeal ancestor that subsequently increased in complexity via gene duplication and horizontal gene transfer.
evolution | phylogenetics | Archaea
Footnotes
1 To whom correspondence should be addressed. Email: tom.a.williams@bristol.ac.uk.
2 G.J.S. and A.S. contributed equally to this work.
Author contributions: T.A.W., T.J.G.E., and T.M.E. designed research; T.A.W., G.J.S., A.S., P.G.F., S.E.H., and B.B. performed research; G.J.S. and B.B. contributed new reagents/analytic tools; T.A.W., G.J.S., A.S., P.G.F., S.E.H., and B.B. analyzed data; and T.A.W., A.S., T.J.G.E., and T.M.E. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access option.