Abstract

BackgroundThe nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees. However, bootstrapping is computationally expensive and remains a bottleneck in phylogenetic analyses. Recently, an ultrafast bootstrap approximation (UFBoot) approach was proposed for maximum likelihood analyses. However, such an approach is still missing for maximum parsimony.ResultsTo close this gap we present MPBoot, an adaptation and extension of UFBoot to compute branch supports under the maximum parsimony principle. MPBoot works for both uniform and non-uniform cost matrices. Our analyses on biological DNA and protein showed that under uniform cost matrices, MPBoot runs on average 4.7 (DNA) to 7 times (protein data) (range: 1.2–20.7) faster than the standard parsimony bootstrap implemented in PAUP*; but 1.6 (DNA) to 4.1 times (protein data) slower than the standard bootstrap with a fast search routine in TNT (fast-TNT). However, for non-uniform cost matrices MPBoot is 5 (DNA) to 13 times (protein data) (range:0.3–63.9) faster than fast-TNT. We note that MPBoot achieves better scores more frequently than PAUP* and fast-TNT. However, this effect is less pronounced if an intensive but slower search in TNT is invoked. Moreover, experiments on large-scale simulated data show that while both PAUP* and TNT bootstrap estimates are too conservative, MPBoot bootstrap estimates appear more unbiased.ConclusionsMPBoot provides an efficient alternative to the standard maximum parsimony bootstrap procedure. It shows favorable performance in terms of run time, the capability of finding a maximum parsimony tree, and high bootstrap accuracy on simulated as well as empirical data sets. MPBoot is easy-to-use, open-source and available at http://www.cibiv.at/software/mpboot.

Highlights

  • The nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees

  • We examined the standard bootstrap implemented in PAUP* by applying a randomized stepwise addition followed by full Tree bisection and reconnection (TBR) searches independently on the original as well as bootstrap multiple sequence alignment (MSA)

  • Fast-TNT is faster than MPBoot SPR3 for uniform cost matrix (94.3% DNA and 66.7% protein MSAs) but took substantially much more time than MPBoot SPR3 under non-uniform cost matrix

Read more

Summary

Introduction

The nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees. An ultrafast bootstrap approximation (UFBoot) approach was proposed for maximum likelihood analyses. Such an approach is still missing for maximum parsimony. Phylogenetic inference on empirical data typically includes bootstrapping. This enables the reconstructed tree to be annotated with support values for each of its branches. A bootstrap tree is reconstructed by conducting an Maximum parsimony (MP) is widely used to infer phylogenies ([3] and references therein). Computing the branch support for MP trees is still time consuming especially for large data sets. In addition to run-time limitations, the standard bootstrap is known to be conservative [6]: the support

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call