Weighted bootstrapping: a correction method for assessing the robustness of phylogenetic trees

Vladimir Makarenkov, Pedro Peres-Neto, Pierre Legendre, Alix Boc, Jingxin Xie, François-Joseph Lapointe

doi:10.1186/1471-2148-10-250

Open Access

Abstract

Background

Non-parametric bootstrapping is a widely used statistical procedure for assessing confidence of model parameters based on the empirical distribution of the observed data [1] and, as such, it has become a common method for assessing tree confidence in phylogenetics [2]. Traditional non-parametric bootstrapping does not weight the trees inferred from resampled (i.e., pseudo-replicated) sequences. Hence, the quality of these trees is not taken into account when computing the bootstrap scores associated with the clades of the original phylogeny. As a consequence, trees with different bootstrap support, or trees providing a different fit to the corresponding pseudo-replicated sequences (the fit quality can be expressed through the LS, ML or parsimony score), contribute equally to the computation of the bootstrap support of the original phylogeny.

Results

In this article, we discuss the idea of applying weighted bootstrapping to phylogenetic reconstruction by weighting each phylogeny inferred from resampled sequences. Tree weights can be based either on the least-squares (LS) tree estimate or on the average secondary bootstrap score (SBS) associated with each resampled tree. Secondary bootstrapping consists of estimating the bootstrap scores of the trees inferred from resampled data. The LS-based and SBS-based bootstrapping procedures were designed to take into account the quality of each "pseudo-replicated" phylogeny in the final tree estimation.

A simulation study was carried out to evaluate the performance of five weighting strategies: LS-based and SBS-based bootstrapping, LS-based and SBS-based bootstrapping with data normalization, and traditional unweighted bootstrapping.

Conclusions

The simulations conducted with two real data sets and the five weighting strategies suggest that SBS-based bootstrapping with data normalization usually exhibits larger bootstrap scores and higher robustness than the four other competing strategies, including traditional bootstrapping. The high robustness of the normalized SBS could be particularly useful in situations where the observed sequences have been affected by noise or have undergone massive insertion or deletion events. The results provided by the four other strategies were very similar regardless of the noise level, which also demonstrates the stability of the traditional bootstrapping method.
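The weighting idea described in the abstract can be illustrated with a minimal Python sketch. It assumes each tree inferred from a pseudo-replicate is summarized as a set of clades (each clade a frozenset of taxon names) and already carries a non-negative quality weight. The function name `weighted_support`, the toy trees, and the weight values are hypothetical; how the paper derives weights from LS or SBS scores, and how it normalizes them, is not given in this text and is not reproduced here.

```python
from typing import FrozenSet, Sequence, Set

Clade = FrozenSet[str]


def weighted_support(clade: Clade,
                     replicate_trees: Sequence[Set[Clade]],
                     weights: Sequence[float]) -> float:
    """Weighted fraction of replicate trees that contain `clade`.

    With equal weights this reduces to the usual bootstrap frequency.
    The weights are placeholders for some tree-quality measure.
    """
    if len(replicate_trees) != len(weights):
        raise ValueError("need exactly one weight per replicate tree")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    supporting = sum(w for tree, w in zip(replicate_trees, weights)
                     if clade in tree)
    return supporting / total


# Toy data (made up): three replicate trees, each given as a set of clades.
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
ac = frozenset({"A", "C"})
trees = [{ab, cd}, {ac}, {ab}]

unweighted = weighted_support(ab, trees, [1.0, 1.0, 1.0])  # plain frequency
weighted = weighted_support(ab, trees, [2.0, 1.0, 1.0])    # first tree counts double
```

In this sketch a replicate tree with a larger weight moves the support of its clades further than a poorly fitting one, which is the only property the example is meant to show.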

Highlights

Non-parametric bootstrapping is a widely used statistical procedure for assessing confidence of model parameters based on the empirical distribution of the observed data [1] and, as such, it has become a common method for assessing tree confidence in phylogenetics [2].
The simulations conducted with two real data sets and the five weighting strategies suggest that bootstrapping based on secondary bootstrap scores (SBS) with data normalization usually exhibits larger bootstrap scores and higher robustness than the four other competing strategies, including traditional bootstrapping.
Traditional bootstrapping does not take into account either the “tree-likeness” of phylogenies inferred from pseudo-replicated sequences or the bootstrap support of those phylogenies.

Summary

Introduction

Non-parametric bootstrapping is a widely used statistical procedure for assessing confidence of model parameters based on the empirical distribution of the observed data [1] and, as such, it has become a common method for assessing tree confidence in phylogenetics [2]. Traditional non-parametric bootstrapping does not weight the trees inferred from resampled (i.e., pseudo-replicated) sequences, so the quality of these trees is not taken into account when computing the bootstrap scores associated with the clades of the original phylogeny. Trees with different bootstrap support, or trees providing a different fit to the corresponding pseudo-replicated sequences (the fit quality can be expressed through the LS, ML or parsimony score), contribute equally to the computation of the bootstrap support of the original phylogeny. Non-parametric bootstrapping proceeds by generating pseudo-replicates of the observed data and inferring a tree from each pseudo-replicate. The frequency with which a given branch of the original phylogeny is found among these trees represents its bootstrap support (i.e., bootstrap score).
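The traditional procedure summarized above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the alignment is a dict of equal-length sequences, sites (columns) are resampled with replacement, and tree inference is passed in as a caller-supplied function returning a set of clades. All names (`resample_columns`, `bootstrap_support`, `closest_pair_clade`) are hypothetical, and `closest_pair_clade` is a toy stand-in that only groups the two most similar sequences; it is not a real tree-building method.

```python
import random
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Set

Clade = FrozenSet[str]
Alignment = Dict[str, str]


def resample_columns(alignment: Alignment, rng: random.Random) -> Alignment:
    """Build one pseudo-replicate by sampling sites with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


def bootstrap_support(alignment: Alignment,
                      infer_clades: Callable[[Alignment], Set[Clade]],
                      original_clades: Set[Clade],
                      n_replicates: int = 100,
                      seed: int = 0) -> Dict[Clade, float]:
    """Fraction of replicate trees in which each original clade reappears."""
    rng = random.Random(seed)
    counts = {clade: 0 for clade in original_clades}
    for _ in range(n_replicates):
        replicate_clades = infer_clades(resample_columns(alignment, rng))
        for clade in counts:
            if clade in replicate_clades:
                counts[clade] += 1
    return {clade: k / n_replicates for clade, k in counts.items()}


def closest_pair_clade(alignment: Alignment) -> Set[Clade]:
    """Toy stand-in for tree inference: the pair with the fewest mismatches."""
    def mismatches(pair):
        a, b = pair
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    return {frozenset(min(combinations(sorted(alignment), 2), key=mismatches))}


# Made-up alignment: A and B are identical, C and D differ at every site.
toy = {"A": "AAAAAAAA", "B": "AAAAAAAA", "C": "CCCCCCCC", "D": "GGGGGGGG"}
support = bootstrap_support(toy, closest_pair_clade,
                            {frozenset({"A", "B"})}, n_replicates=20)
```

Every replicate tree counts equally in `bootstrap_support`, which is exactly the property the weighted variants discussed in this article are meant to relax.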

Journal: BMC Evolutionary Biology
Publication Date: Jan 1, 2010
Citations: 11
License type: cc-by
