Abstract

Molecular phylogenetics and phylogenomics are subject to noise from horizontal gene transfer (HGT) and bias from convergence in macromolecular compositions. Extensive variation in size, structure and base composition of alphaproteobacterial genomes has complicated their phylogenomics, sparking controversy over the origins and closest relatives of the SAR11 strains. SAR11 are highly abundant, cosmopolitan aquatic Alphaproteobacteria with streamlined, A+T-biased genomes. A dominant view holds that SAR11 are monophyletic and related to both Rickettsiales and the ancestor of mitochondria. Other studies dispute this, finding evidence of a polyphyletic origin of SAR11 with most strains distantly related to Rickettsiales. Although careful evolutionary modeling can reduce bias and noise in phylogenomic inference, entirely different approaches may be useful to extract robust phylogenetic signals from genomes. Here we develop simple phyloclassifiers from bioinformatically derived tRNA Class-Informative Features (CIFs), features predicted to target tRNAs for specific interactions within the tRNA interaction network. Our tRNA CIF-based model robustly and accurately classifies alphaproteobacterial genomes into one of seven undisputed monophyletic orders or families, despite great variability in tRNA gene complement sizes and base compositions. Our model robustly rejects monophyly of SAR11, classifying all but one strain as Rhizobiales with strong statistical support. Yet remarkably, conventional phylogenetic analysis of tRNAs classifies all SAR11 strains identically as Rickettsiales. We attribute this discrepancy to convergence of SAR11 and Rickettsiales tRNA base compositions. Thus, tRNA CIFs appear more robust to compositional convergence than tRNA sequences generally. Our results suggest that tRNA-CIF-based phyloclassification is robust to HGT of components of the tRNA interaction network, such as aminoacyl-tRNA synthetases. 
We explain why tRNAs are especially advantageous for prediction of traits governing macromolecular interactions from genomic data, and why such traits may be advantageous in the search for robust signals to address difficult problems in classification and phylogeny.

Highlights

  • Which parts of genomes are most resistant to compositional convergence? Which information is vertically inherited most faithfully? Compositional stationarity and vertical inheritance are key, yet frequently violated, assumptions of most current approaches in molecular phylogenetics and phylogenomics [1].

  • We have developed ways to predict, from genomic data alone, how tRNAs distinguish themselves to their specific interaction partners

  • We validated our model by classifying hundreds of diverse alphaproteobacterial taxa and tested it on eight strains of SAR11, a phylogenetically controversial group that is highly abundant in the world’s oceans

Introduction

Which parts of genomes are most resistant to compositional convergence? Compositional stationarity and vertical (co-)inheritance are key, yet frequently violated, assumptions of most current approaches in molecular phylogenetics and phylogenomics [1]. Advances in understanding the history of life will require discovery of new universal, slowly evolving phylogenetic markers that are resistant to compositional convergence and HGT. Some recent phylogenomic studies place free-living SAR11 together in a clade with the largely endoparasitic Rickettsiales and the alphaproteobacterial ancestor of mitochondria [6,7,8]. Other studies persuasively argue that this placement is an artifact of independent convergence of SAR11 and Rickettsiales towards increased genomic A+T contents, and that SAR11 are more closely related to free-living Alphaproteobacteria such as the Rhizobiales and Rhodobacteraceae [9,10,11]. The monophyly of SAR11 was recently rejected [10,12].
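To make the notion of A+T content concrete, the following minimal Python sketch computes the A+T fraction of a nucleotide sequence. It is an illustration only, not the authors' method: the function name `at_fraction` and the toy sequences are hypothetical, and ambiguous bases such as N are assumed to be ignored.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T (or U) bases among unambiguous bases.

    Assumption for this sketch: any character other than A, T, U, G, C
    (for example N or a gap) is ignored rather than counted.
    """
    seq = seq.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return at / total


# Hypothetical toy sequences, not real genomic data.
toy_at_rich = "ATTTAAGATTATAAT"
toy_gc_rich = "GCGGCCGATCGCGGC"

if __name__ == "__main__":
    print(f"A+T-rich toy sequence: {at_fraction(toy_at_rich):.2f}")
    print(f"G+C-rich toy sequence: {at_fraction(toy_gc_rich):.2f}")
```

Two unrelated lineages whose genomes both drift toward a high A+T fraction can come to resemble each other in composition, which is the kind of convergence that the studies cited above argue can mislead phylogenomic inference.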
