Abstract

Core gene phylogenies provide a window into early evolution, but different gene sets and analytical methods have yielded substantially different views of the tree of life. Trees inferred from a small set of universal core genes have typically supported a long branch separating the archaeal and bacterial domains. By contrast, recent analyses of a broader set of non-ribosomal genes have suggested that Archaea may be less divergent from Bacteria, and that estimates of inter-domain distance are inflated due to accelerated evolution of ribosomal proteins along the inter-domain branch. Resolving this debate is key to determining the diversity of the archaeal and bacterial domains, the shape of the tree of life, and our understanding of the early course of cellular evolution. Here, we investigate the evolutionary history of the marker genes key to the debate. We show that estimates of a reduced Archaea-Bacteria (AB) branch length result from inter-domain gene transfers and hidden paralogy in the expanded marker gene set. By contrast, analysis of a broad range of manually curated marker gene datasets from an evenly sampled set of 700 Archaea and Bacteria reveals that current methods likely underestimate the AB branch length due to substitutional saturation and poor model fit; that the best-performing phylogenetic markers tend to support longer inter-domain branch lengths; and that the AB branch lengths of ribosomal and non-ribosomal marker genes are statistically indistinguishable. Furthermore, our phylogeny inferred from the 27 highest-ranked marker genes recovers a clade of DPANN at the base of the Archaea and places the bacterial Candidate Phyla Radiation (CPR) within Bacteria as the sister group to the Chloroflexota.

Highlights

  • Much remains unknown about the earliest period of cellular evolution and the deepest divergences in the tree of life

  • We examine the evolutionary history of the 381-gene marker set and identify several features of these genes, including instances of inter-domain gene transfer and mixed paralogy, that may contribute to the inference of a shorter AB branch length in concatenation analyses

  • Manual inspection of subsampled versions of these gene trees suggested that 317/381 did not possess an unambiguous branch separating the archaeal and bacterial domains (Supplementary File 1). These distributions suggest that many of these genes are not broadly present in both domains, and that some might be specific to Bacteria.

Summary

Introduction

Much remains unknown about the earliest period of cellular evolution and the deepest divergences in the tree of life. Although failure to model site-specific amino acid preferences has previously been shown to lead to underestimation of the AB branch length due to a failure to detect convergent changes (Tourasse and Gouy, 1999; Williams et al., 2020), the published analysis of the 381 marker set did not find evidence of a substantial impact of these features on the tree as a whole (Zhu et al., 2019). Those analyses identified phylogenetic incongruence among the 381 markers, but did not determine the underlying cause (Zhu et al., 2019). We identify a subset of marker genes least affected by these issues, and use these to estimate an updated tree of the primary domains of life and the length of the branch that separates Archaea and Bacteria.
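The link between unmodelled site-specific amino acid preferences and an underestimated branch length can be illustrated with a toy calculation. The sketch below is not the models or software used in the study. It assumes a deliberately simplified equal-rates model in which each site switches among `k` equally likely states, and the function names and the choice of `k = 4` allowed amino acids per site are hypothetical. Under that assumption, sequences that evolve with few allowed states per site accumulate many convergent and reverse changes, and a correction that assumes all 20 amino acids are available at every site recovers a shorter distance than the true one.

```python
import math


def expected_p_distance(d, k):
    """Expected fraction of differing sites after d substitutions per site,
    under an equal-rates model with k interchangeable states per site."""
    limit = (k - 1) / k  # saturation plateau of the observed difference
    return limit * (1.0 - math.exp(-d / limit))


def corrected_distance(p, k):
    """Estimate substitutions per site from an observed difference p,
    assuming k interchangeable states per site (inverse of the above)."""
    limit = (k - 1) / k
    if p >= limit:
        return math.inf  # fully saturated: the distance cannot be recovered
    return -limit * math.log(1.0 - p / limit)


if __name__ == "__main__":
    true_states = 4      # assumption: only 4 amino acids tolerated per site
    assumed_states = 20  # the correction assumes all 20 are available
    for true_d in (0.5, 1.0, 2.0, 3.0):
        p = expected_p_distance(true_d, true_states)
        est = corrected_distance(p, assumed_states)
        print(f"true d = {true_d:.1f}  observed p = {p:.3f}  "
              f"estimated d = {est:.3f}")
```

In this toy setting the gap between the true and estimated distance widens as the true distance grows, which is the qualitative pattern expected for a long, saturated branch such as the one separating Archaea and Bacteria.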
