SIESTA: enhancing searches for optimal supertrees and species trees

Pranjal Vachaspati, Tandy Warnow

doi:10.1186/s12864-018-4621-1

Abstract

Background: Many supertree estimation and multi-locus species tree estimation methods compute trees by combining trees on subsets of the species set based on some NP-hard optimization criterion. A recent approach to computing large trees has been to constrain the search space by defining a set of “allowed bipartitions”, and then use dynamic programming to find provably optimal solutions in polynomial time. Several phylogenomic estimation methods, such as ASTRAL, the MDC algorithm in PhyloNet, FastRFS, and ALE, use this approach.

Results: We present SIESTA, a method that can be combined with these dynamic programming algorithms to return a data structure that compactly represents all the optimal trees in the search space. As a result, SIESTA provides multiple capabilities, including: (1) counting the number of optimal trees, (2) calculating consensus trees, (3) generating a random optimal tree, and (4) annotating branches in a given optimal tree by the proportion of optimal trees it appears in.

Conclusions: SIESTA improves the accuracy of FastRFS and ASTRAL, and is a general technique for enhancing dynamic programming methods for constrained optimization.
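The constrained dynamic programming idea described in the abstract can be illustrated with a toy sketch. The code below is not the SIESTA implementation: the function name `count_optimal_trees`, the representation of allowed clades as frozensets, the rooted-clade formulation, and the `weight` function are all assumptions made for illustration. It computes, for each allowed clade, the best achievable score and the number of rooted binary trees that attain it, which is the kind of bookkeeping that makes counting optimal trees possible.

```python
from functools import lru_cache


def count_optimal_trees(taxa, allowed_clades, weight):
    """Toy constrained DP: return (best_score, number_of_optimal_trees).

    taxa           -- iterable of orderable, hashable taxon labels
    allowed_clades -- iterable of taxon subsets the search may use
    weight         -- hypothetical scoring function weight(left, right)
                      for splitting a clade into two child clades
    """
    allowed = {frozenset(c) for c in allowed_clades}
    allowed |= {frozenset([t]) for t in taxa}  # leaves are always allowed
    root = frozenset(taxa)
    allowed.add(root)

    @lru_cache(maxsize=None)
    def solve(clade):
        if len(clade) == 1:
            return (0, 1)  # a single leaf: score 0, exactly one tree
        anchor = min(clade)  # used to visit each unordered split once
        best, count = None, 0
        for left in allowed:
            if anchor not in left or not left < clade:
                continue
            right = clade - left
            if right not in allowed:
                continue
            left_score, left_count = solve(left)
            right_score, right_count = solve(right)
            if left_count == 0 or right_count == 0:
                continue  # a child clade cannot be resolved
            score = left_score + right_score + weight(left, right)
            if best is None or score > best:
                best, count = score, left_count * right_count
            elif score == best:
                count += left_count * right_count  # tie: another optimum
        return (best, count)

    return solve(root)
```

With every subset of four taxa allowed and a constant weight of zero, every rooted binary tree is optimal, so the count equals the total number of rooted binary trees on four leaves. Rewarding one particular clade narrows the optimal set to the trees that contain it.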

Highlights

Many supertree estimation and multi-locus species tree estimation methods compute trees by combining trees on subsets of the species set based on some NP-hard optimization criterion.
We explore the impact of using SIESTA with two methods that use dynamic programming for constrained exact optimization: the supertree method FastRFS [6] and the incomplete lineage sorting (ILS)-aware species tree estimation method ASTRAL [13].
We show how to correct these support values to take the full set of optimal ASTRAL trees into account, and enable the calculation of a maximum clade credibility (MCC) tree based on these corrected values.

Summary

Introduction

Many supertree estimation and multi-locus species tree estimation methods compute trees by combining trees on subsets of the species set based on some NP-hard optimization criterion. Phylogeny estimation is generally approached as a statistical estimation problem, and finding the best tree for a given dataset is typically based on methods that are computationally very intensive; for example, maximum likelihood phylogeny estimation is NP-hard [1] and Bayesian MCMC methods require a long time to converge. For this reason, among others, the calculation of very large phylogenies is often enabled by divide-and-conquer methods that use “supertree methods” to combine smaller trees into larger trees. Examples of such “summary methods” (i.e., methods that construct species trees by combining gene trees) that are statistically consistent under the multi-species coalescent model include ASTRAL [12,13,14], GLASS [15], the population tree in BUCKy [16], MP-EST [17], NJst [18], and a modification of NJst called ASTRID [19].

Methods

Results

Conclusion

Journal: BMC Genomics	Publication Date: May 1, 2018
Citations: 4	License type: open-access
