Abstract

Almost all standard phylogenetic methods for reconstructing gene trees result in unrooted trees; yet, many of the most useful applications of gene trees require that the gene trees be correctly rooted. As a result, several computational methods have been developed for inferring the root of unrooted gene trees. However, the accuracy of such methods has never been systematically evaluated on prokaryotic gene families, where horizontal gene transfer is often one of the dominant evolutionary events driving gene family evolution. In this work, we address this gap by conducting a thorough comparative evaluation of five different rooting methods using large collections of both simulated and empirical prokaryotic gene trees. Our simulation study is based on 6000 true and reconstructed gene trees on 100 species and characterizes the rooting accuracy of the four methods under 36 different evolutionary conditions and 3 levels of gene tree reconstruction error. The empirical study is based on a large, carefully designed data set of 3098 gene trees from 504 bacterial species (406 Alphaproteobacteria and 98 Cyanobacteria) and reveals insights that supplement those gleaned from the simulation study. Overall, this work provides several valuable insights into the accuracy of the considered methods that will help inform the choice of rooting methods to use when studying microbial gene family evolution. Among other findings, this study identifies parsimonious Duplication-Transfer-Loss (DTL) rooting and Minimal Ancestor Deviation (MAD) rooting as two of the most accurate gene tree rooting methods for prokaryotes and specifies the evolutionary conditions under which these methods are most accurate. It also demonstrates that DTL rooting is highly sensitive to high evolutionary rates and gene tree error, and that rooting methods based on branch lengths are generally robust to gene tree reconstruction error.

Highlights

  • We found that root positions inferred by Minimal Ancestor Deviation (MAD) and Minimum Variance (MV) rooting showed the most agreement among all assessed methods, while DTL rooting and Amalgamated Likelihood Estimation (ALE) rooting yielded the most divergent root positions compared to MAD and MV rooting

  • ALE rooting consistently performs worse than DTL rooting

Summary

Background

Phylogenetic trees, or phylogenies, represent evolutionary relationships between biological entities. To the best of our knowledge, there is only one empirical study that uses prokaryotic data to evaluate the accuracy of some of these rooting methods [15]. As a result, it is not known how well these methods work for rooting prokaryotic gene trees, where horizontal gene transfer is ubiquitous. Note that it is not possible to directly evaluate the accuracy of each rooting method on empirical data, given the innate uncertainty of evolutionary reconstruction. The empirical data set was designed such that the placement of the root in gene families conserved in both phyla likely coincides with taxonomic boundaries, i.e., is likely to be placed between Alphaproteobacteria and Cyanobacteria. MAD, MV, and midpoint rooting are generally robust to gene tree reconstruction error and remain effective at rooting even when the gene trees under consideration have significant reconstruction error, while DTL rooting is more sensitive to such errors.
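As an illustration of the taxonomic-boundary idea described above, the following minimal sketch checks whether a candidate root splits a gene tree's leaves cleanly into the two phyla. The function name `root_separates_phyla`, the leaf labels, and the strict "exactly one phylum per side" criterion are all assumptions made for this example; the study's actual scoring procedure is not given in this summary and may well be more tolerant of transferred genes.

```python
from typing import Dict, Iterable, Set


def root_separates_phyla(
    left_leaves: Iterable[str],
    right_leaves: Iterable[str],
    phylum_of: Dict[str, str],
) -> bool:
    """Return True if each side of the root contains a single phylum
    and the two sides contain different phyla.

    This strict criterion is an assumption for illustration only.
    """
    left_phyla: Set[str] = {phylum_of[leaf] for leaf in left_leaves}
    right_phyla: Set[str] = {phylum_of[leaf] for leaf in right_leaves}
    return (
        len(left_phyla) == 1
        and len(right_phyla) == 1
        and left_phyla != right_phyla
    )


# Hypothetical toy gene family with two genes from each phylum.
toy_phylum_of = {
    "alpha_1": "Alphaproteobacteria",
    "alpha_2": "Alphaproteobacteria",
    "cyano_1": "Cyanobacteria",
    "cyano_2": "Cyanobacteria",
}

# Root placed on the branch between the two phyla.
clean_root = root_separates_phyla(
    ["alpha_1", "alpha_2"], ["cyano_1", "cyano_2"], toy_phylum_of
)

# Root placed so that each side mixes both phyla.
mixed_root = root_separates_phyla(
    ["alpha_1", "cyano_1"], ["alpha_2", "cyano_2"], toy_phylum_of
)
```

In this toy case `clean_root` is true and `mixed_root` is false. A check of this kind can only indicate agreement with the expected boundary; it does not establish the true root, which remains unknown for empirical data.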

Materials and methods
Results
Results of simulation study
Results of empirical data analysis
Discussion and conclusion
