Abstract

Background

Duplication-Transfer-Loss (DTL) reconciliation is a powerful and increasingly popular technique for studying the evolution of microbial gene families. DTL reconciliation requires the use of rooted gene trees to perform the reconciliation with the species tree, and the standard technique for rooting gene trees is to assign a root that results in the minimum reconciliation cost across all rootings of that gene tree. However, even though it is well understood that many gene trees have multiple optimal roots, only a single optimal root is randomly chosen to create the rooted gene tree and perform the reconciliation. This remains an important overlooked and unaddressed problem in DTL reconciliation, leading to incorrect evolutionary inferences. In this work, we perform an in-depth analysis of the impact of uncertain gene tree rooting on the computed DTL reconciliation and provide the first computational tools to quantify and negate the impact of gene tree rooting uncertainty on DTL reconciliation.

Results

Our analysis of a large data set of over 4500 gene families from 100 species shows that a large fraction of gene trees have multiple optimal rootings, that these multiple roots often, but not always, appear closely clustered together in the same region of the gene tree, that many aspects of the reconciliation remain conserved across the multiple rootings, that gene tree error has a profound impact on the prevalence and structure of multiple optimal rootings, and that there are specific interesting patterns in the reconciliation of those gene trees that have multiple optimal roots.

Conclusions

Our results show that unrooted gene trees can be meaningfully reconciled and high-quality evolutionary information can be obtained from them even after accounting for multiple optimal rootings. In addition, the techniques and tools introduced in this paper make it possible to systematically avoid incorrect evolutionary inferences caused by incorrect or uncertain gene tree rooting. These tools have been implemented in the phylogenetic reconciliation software package RANGER-DTL 2.0, freely available from http://compbio.engr.uconn.edu/software/RANGER-DTL/.

Highlights

  • Duplication-Transfer-Loss (DTL) reconciliation is a powerful and increasingly popular technique for studying the evolution of microbial gene families

  • We analyze a large data set of over 4500 gene families from 100 species and (i) show that a large fraction of gene trees have multiple optimal rootings, (ii) show that these multiple roots often, but not always, appear clustered together in the same region of the gene tree, (iii) define the notion of a consensus reconciliation which captures the variability in the reconciliation due to multiple gene tree rootings, (iv) compute consensus reconciliations and use them to show that many aspects of the reconciliation remain conserved across the multiple rootings, and (v) demonstrate that gene tree error has a profound impact on the prevalence and structure of multiple optimal rootings

  • Our set of RAxML gene trees represents a “default” set of gene trees constructed using a standard, commonly used method for gene tree construction, while the set of TreeFix-DTL trees represents a more accurate set of gene trees with fewer topological errors [21] constructed using a state-of-the-art error-correction method. Analyzing these two sets of gene trees separately makes it possible to assess the impact of gene tree error on the prevalence and structure of multiple optimal rootings

Summary

Introduction

Duplication-Transfer-Loss (DTL) reconciliation is a powerful and increasingly popular technique for studying the evolution of microbial gene families. Even though it is well understood that many gene trees have multiple optimal roots, only a single optimal root is randomly chosen to create the rooted gene tree and perform the reconciliation. This remains an important overlooked and unaddressed problem in DTL reconciliation, leading to incorrect evolutionary inferences. Given the evolutionary tree for a gene family, i.e., a gene tree, and the evolutionary tree for the corresponding species, i.e., a species tree, DTL reconciliation compares the gene tree with the species tree and reconciles any differences between the two by proposing gene duplication, horizontal gene transfer, and gene loss events. Accurate knowledge of these events and of gene family evolution overall has many important applications throughout biology, and the DTL reconciliation problem has been extensively studied, e.g., [1–13].
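The rooting issue described above can be illustrated with a small sketch: root an unrooted gene tree on every edge, score each rooting, and keep all rootings that tie for the minimum cost instead of picking one at random. This is a toy illustration under stated assumptions, not the RANGER-DTL implementation. In particular, the cost used here is a stand-in (the number of duplications under the simple LCA mapping), not the DTL reconciliation cost, and every name in it (`subtree`, `all_rootings`, `make_dup_cost`, `optimal_rootings`, the leaf-naming scheme) is hypothetical.

```python
# Toy sketch: collect ALL minimum-cost rootings of an unrooted binary gene tree.
# Assumption: the cost is duplication count under LCA mapping, a stand-in for
# the real DTL reconciliation cost.

def subtree(adj, node, parent):
    """Rooted subtree (nested tuples) hanging below `node`, away from `parent`."""
    children = [n for n in adj[node] if n != parent]
    if not children:
        return node  # leaf: return its label
    left, right = (subtree(adj, c, node) for c in children)
    return (left, right)

def all_rootings(adj):
    """Yield (edge, rooted_tree) for a root placed on each edge."""
    edges = {tuple(sorted((u, v))) for u in adj for v in adj[u]}
    for u, v in sorted(edges):
        yield (u, v), (subtree(adj, u, v), subtree(adj, v, u))

def clades(tree):
    """Return (leaf set of `tree`, list of all clades in `tree`)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    left_set, left_clades = clades(tree[0])
    right_set, right_clades = clades(tree[1])
    both = left_set | right_set
    return both, left_clades + right_clades + [both]

def make_dup_cost(species_tree, species_of):
    """Build a cost function counting duplications under LCA mapping."""
    _, species_clades = clades(species_tree)

    def lca(species):
        # Smallest species-tree clade that contains the given species set.
        return min((c for c in species_clades if species <= c), key=len)

    def cost(gene_tree):
        dups = 0

        def visit(node):
            nonlocal dups
            if isinstance(node, str):
                return lca(frozenset([species_of(node)]))
            left, right = visit(node[0]), visit(node[1])
            mapped = lca(left | right)
            if mapped == left or mapped == right:
                dups += 1  # node maps to the same species node as a child
            return mapped

        visit(gene_tree)
        return dups

    return cost

def optimal_rootings(adj, cost):
    """Return (minimum cost, list of (edge, rooted_tree) achieving it)."""
    scored = [(cost(tree), edge, tree) for edge, tree in all_rootings(adj)]
    best = min(score for score, _, _ in scored)
    return best, [(edge, tree) for score, edge, tree in scored if score == best]

# Unrooted gene tree as an adjacency map; x and y are internal nodes.
gene_adj = {
    "a1": ["x"], "b1": ["x"], "c1": ["y"], "c2": ["y"],
    "x": ["a1", "b1", "y"], "y": ["c1", "c2", "x"],
}
species_tree = (("A", "B"), "C")
cost = make_dup_cost(species_tree, lambda leaf: leaf[0].upper())

best, optimal = optimal_rootings(gene_adj, cost)
for edge, tree in optimal:
    print(best, edge, tree)
```

In this toy input, three of the five possible rootings tie for the minimum, and all three sit on edges incident to the same internal node. Returning the full list, rather than one arbitrary member of it, is what allows downstream analysis to check which inferences hold across every optimal rooting.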

