Abstract

Background: In phylogenetic analysis, it is common to infer unrooted trees. However, knowing the root location is desirable for downstream analyses and interpretation. There exist several methods to recover a root, such as molecular clock analysis (including midpoint rooting) or rooting the tree using an outgroup. Non-reversible Markov models can also be used to compute the likelihood of a potential root position.

Results: We present a software tool called RootDigger, which uses a non-reversible Markov model to compute the most likely root location on a given tree and to infer a confidence value for each possible root placement. We find that RootDigger is successful at finding roots when compared to similar tools such as IQ-TREE and MAD, and occasionally outperforms them. Additionally, we find that the exhaustive mode of RootDigger is useful in quantifying and explaining uncertainty in rooting positions.

Conclusions: RootDigger can be used on an existing phylogeny to find a root, or to assess the uncertainty of the root placement. RootDigger is available under the MIT licence at https://www.github.com/computations/root_digger.

Highlights

  • In phylogenetic analysis, it is common to infer unrooted trees

  • For simulations and empirical data, we computed the topological distance from the estimated root to the true root, and normalized it by the number of nodes in the tree

  • In Huelsenbeck [16], it was shown that the prior probability of a root placement on a sample tree did not have a strong signal when using a non-reversible model of character substitution
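The normalized root distance mentioned in the highlights can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the paper: a root placement is identified by the branch it sits on, the topological distance between two placements is counted as the number of nodes on the path separating the two branches, and the tree is stored as a plain adjacency dictionary. The function names and the toy quartet tree are hypothetical.

```python
from collections import deque


def node_distances(adj, start):
    """Breadth-first search: number of edges from `start` to every node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def normalized_root_distance(adj, est_branch, true_branch):
    """Nodes separating two root branches, divided by the node count.

    Assumption: each root placement is given as the (u, v) branch it lies on.
    """
    if set(est_branch) == set(true_branch):
        return 0.0
    gaps = []
    for u in est_branch:
        dist = node_distances(adj, u)
        gaps.extend(dist[v] for v in true_branch)
    # The closest pair of endpoints is `min(gaps)` edges apart, so the path
    # between the two branches passes through `min(gaps) + 1` nodes.
    nodes_between = min(gaps) + 1
    return nodes_between / len(adj)


# Toy unrooted quartet: tips a, b, c, d and internal nodes x, y.
quartet = {
    "a": ["x"], "b": ["x"], "c": ["y"], "d": ["y"],
    "x": ["a", "b", "y"], "y": ["x", "c", "d"],
}
error = normalized_root_distance(quartet, ("a", "x"), ("y", "c"))
print(error)
```

In the toy tree, the branches a-x and y-c are separated by the two internal nodes x and y, and the tree has six nodes in total. Other conventions (for example counting branches instead of nodes) would change the numbers but not the idea.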


Summary

Background

Most phylogenetic inference tools [1, 2] yield unrooted trees. This is because they typically implement time-reversible nucleotide substitution models [3], which make the phylogenetic inference problem computationally tractable. To root a tree when the primary phylogenetic inference is performed via a reversible model, researchers typically deploy one of the two following methods: including a set of outgroup taxa in the analysis, or using some form of molecular clock analysis. A further approach, and the one primarily used in this work, is to eliminate the reversibility assumption of standard character (e.g., nucleotide or amino acid) substitution models. Eliminating this assumption significantly increases the computational effort required to find a good (high likelihood) phylogenetic tree. RootDigger uses a given tree and its branch lengths to find the most likely root location by calculating the likelihood of a root location under a non-reversible model of DNA substitution (UNREST [23] with a user-specified number of Γ discrete rate categories, and an optional proportion of invariant sites, i.e., UNREST+Γ+I).
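To make the reversibility assumption concrete, the following Python sketch builds a 4x4 nucleotide rate matrix from twelve off-diagonal rates (the unrestricted form) and tests whether it satisfies detailed balance, π_i q_ij = π_j q_ji. It is an illustration only and not RootDigger's implementation: the function names, the power-iteration approach to the stationary distribution, and the two example rate sets are all assumptions chosen for brevity.

```python
from itertools import permutations

STATES = "ACGT"


def build_rate_matrix(rates):
    """Build Q from a dict mapping ordered pairs (from, to) to rates."""
    q = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            if i != j:
                q[i][j] = rates[(STATES[i], STATES[j])]
        # Diagonal entries make every row sum to zero.
        q[i][i] = -sum(q[i])
    return q


def stationary_distribution(q, iterations=5000):
    """Approximate pi with pi Q = 0 by power iteration on P = I + Q / mu."""
    mu = 1.1 * max(-q[i][i] for i in range(4))
    p = [[(1.0 if i == j else 0.0) + q[i][j] / mu for j in range(4)]
         for i in range(4)]
    pi = [0.25] * 4
    for _ in range(iterations):
        pi = [sum(pi[i] * p[i][j] for i in range(4)) for j in range(4)]
    return pi


def is_reversible(q, tol=1e-9):
    """Check detailed balance for every unordered pair of states."""
    pi = stationary_distribution(q)
    return all(abs(pi[i] * q[i][j] - pi[j] * q[j][i]) <= tol
               for i in range(4) for j in range(i + 1, 4))


pairs = list(permutations(STATES, 2))  # the 12 ordered pairs of states

# All rates equal: a symmetric, time-reversible matrix.
equal_rates = build_rate_matrix({pair: 1.0 for pair in pairs})

# Faster substitution around the cycle A -> C -> G -> T -> A than back.
cyclic_rates = {pair: 0.5 for pair in pairs}
for src, dst in zip("ACGT", "CGTA"):
    cyclic_rates[(src, dst)] = 2.0
cyclic = build_rate_matrix(cyclic_rates)

print(is_reversible(equal_rates), is_reversible(cyclic))
```

Under a matrix that passes this check, the likelihood of a tree does not depend on where the root is placed, which is why reversible models produce unrooted trees. A matrix that fails it breaks that symmetry, so different root positions can receive different likelihoods.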

Find the best root location for the current model
Optimize model parameters
Report the tree with annotations for every branch:
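The steps listed above can be read as an alternating optimization, sketched below in Python. This is a hedged illustration, not RootDigger's code: the function `alternate_root_and_model`, its callbacks, the convergence tolerance, and the toy log-likelihood are all hypothetical stand-ins, and the real likelihood would be computed on the tree and alignment under the non-reversible model.

```python
def alternate_root_and_model(branches, log_lik, optimize_params, theta0,
                             tol=1e-6, max_iter=50):
    """Alternate between choosing a root branch and refitting the model.

    log_lik(branch, theta) and optimize_params(branch, theta) are
    caller-supplied, hypothetical callbacks.
    """
    theta = theta0
    best_branch = None
    previous = float("-inf")
    for _ in range(max_iter):
        # Find the best root location for the current model.
        best_branch = max(branches, key=lambda b: log_lik(b, theta))
        # Optimize model parameters with that root held fixed.
        theta = optimize_params(best_branch, theta)
        current = log_lik(best_branch, theta)
        if current - previous < tol:
            break
        previous = current
    # Report a score for every branch under the final model.
    report = {b: log_lik(b, theta) for b in branches}
    return best_branch, theta, report


# Toy stand-in for a likelihood: each branch prefers its own parameter value.
preferred = {"A": 0.0, "B": 1.0, "C": 2.0}
bonus = {"A": 0.0, "B": 0.5, "C": 0.2}


def toy_log_lik(branch, theta):
    return -(theta - preferred[branch]) ** 2 + bonus[branch]


def toy_optimize(branch, theta):
    # The toy objective is maximized exactly at the preferred value.
    return preferred[branch]


best_branch, theta, report = alternate_root_and_model(
    ["A", "B", "C"], toy_log_lik, toy_optimize, theta0=0.9)
print(best_branch, theta, report)
```

A loop of this kind can settle on a local optimum that depends on the starting parameters, so in practice one would likely restart it from several starting points or score every branch exhaustively.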
Results
Conclusions