Abstract

Background: Inferring species trees from gene trees with coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed.

Results: We introduce DISTIQUE, a new statistically consistent summary method for inferring species trees from gene trees under the coalescent model. We generalize our results to arbitrary phylogenetic inference problems; we show that two arbitrarily chosen leaves, called anchors, can be used to estimate relative distances between all other pairs of leaves by inferring relevant quartet trees. This results in a family of distance-based tree inference methods, with running times ranging from quadratic to quartic in the number of leaves.

Conclusions: We show in simulation studies that DISTIQUE has accuracy comparable to leading coalescent-based summary methods, with reduced running times.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3098-z) contains supplementary material, which is available to authorized users.
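The anchor idea in the abstract can be illustrated with a minimal sketch. It assumes that, for fixed anchors a and b and another pair of leaves x and y, each gene tree has already been reduced to one of three quartet labels; the labels ("ab|xy" and so on) and both function names are hypothetical, introduced here only for illustration. The conversion uses the standard multi-species coalescent result that a gene tree quartet matches the species tree quartet with probability 1 - (2/3)exp(-d), where d is the internal branch length in coalescent units. This is not necessarily the exact transformation DISTIQUE applies.

```python
import math
from collections import Counter


def anchored_quartet_frequency(quartet_topologies):
    """Fraction of gene trees whose quartet groups the two anchors together.

    `quartet_topologies` is a list of labels such as "ab|xy", "ax|by", "ay|bx"
    (hypothetical encoding, one label per gene tree).
    """
    counts = Counter(quartet_topologies)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no quartet topologies supplied")
    return counts["ab|xy"] / total


def internal_branch_length(freq):
    """Invert f = 1 - (2/3) * exp(-d) to get d in coalescent units.

    Frequencies at or below 1/3 carry no signal for this topology and map to 0;
    a frequency of 1 corresponds to an unbounded branch length.
    """
    if freq <= 1.0 / 3.0:
        return 0.0
    if freq >= 1.0:
        return math.inf
    return -math.log(1.5 * (1.0 - freq))


# Example: three of four gene trees place the anchors together.
freq = anchored_quartet_frequency(["ab|xy", "ab|xy", "ab|xy", "ax|by"])
length = internal_branch_length(freq)
```

A larger estimated length means x and y sit further from the anchors' side of the tree, which is the kind of relative distance a distance-based method can then consume.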

Highlights

  • When consensus is used within DISTIQUE, the accuracy improves with decreased Incomplete Lineage Sorting (ILS), as expected (Additional file 1: Figures S1 and S2)

  • Hereafter, we only show results for DISTIQUE applied to a majority consensus, and we omit all-pairs-max

Introduction

Inferring species trees from gene trees with coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed. A desirable property for a summary method is statistical consistency: a theoretical guarantee that its estimate converges in probability to the correct species tree as the number of error-free genes increases. Many statistically consistent summary methods are available (e.g., ASTRAL [3, 4], BUCKy-population [5], and MP-EST [6]), and coalescent-based species tree estimation is a vibrant field of research, with many recent examples of successful biological analyses [7,8,9]; see [10,11,12,13,14] for criticism of these methods, especially their sensitivity to gene tree error.
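The consistency guarantee described above can be written formally. As a sketch in notation that does not appear in the original text, let $T$ be the true species tree and $\hat{T}_n$ the tree a summary method returns from $n$ error-free gene trees:

```latex
\lim_{n \to \infty} \Pr\left( \hat{T}_n = T \right) = 1
```

Because tree topologies form a discrete set, convergence in probability here simply means that the probability of returning the wrong topology goes to zero as more genes are added.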
