Abstract

Background: Evolutionary histories can be discordant across the genome, and such discordances need to be considered in reconstructing the species phylogeny. ASTRAL is one of the leading methods for inferring species trees from gene trees while accounting for gene tree discordance. ASTRAL uses dynamic programming to search for the tree that shares the maximum number of quartet topologies with input gene trees, restricting itself to a predefined set of bipartitions.

Results: We introduce ASTRAL-III, which substantially improves the running time of ASTRAL-II and guarantees polynomial running time as a function of both the number of species (n) and the number of genes (k). ASTRAL-III limits the bipartition constraint set (X) to grow at most linearly with n and k. Moreover, it handles polytomies more efficiently than ASTRAL-II, exploits similarities between gene trees better, and uses several techniques to avoid searching parts of the search space that are mathematically guaranteed not to include the optimal tree. The asymptotic running time of ASTRAL-III in the presence of polytomies is $O\left((nk)^{1.726} D\right)$, where $D = O(nk)$ is the sum of degrees of all unique nodes in input trees. The running time improvements enable us to test whether contracting low support branches in gene trees improves accuracy by reducing noise. In extensive simulations, we show that removing branches with very low support (e.g., below 10%) improves accuracy, while overly aggressive filtering is harmful. On a biological avian phylogenomic dataset of 14K genes, we observe that contracting low support branches greatly improves results.

Conclusions: ASTRAL-III is a faster version of the ASTRAL method for phylogenetic reconstruction and can scale up to 10,000 species. With ASTRAL-III, low support branches can be removed, resulting in improved accuracy.
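To make the idea of contracting low support branches concrete, the following is a minimal Python sketch. It is not ASTRAL-III's implementation: the `Node` class, the `contract_low_support` and `to_newick` functions, and the toy tree are all hypothetical and introduced only for illustration. The sketch assumes each internal node stores the support (0 to 100) of the branch above it, and collapses any internal branch whose support falls below a threshold, producing a polytomy.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A node in a rooted gene tree (hypothetical minimal representation)."""
    name: Optional[str] = None        # leaf label; None for internal nodes
    support: Optional[float] = None   # support (0-100) of the branch above this node
    children: List["Node"] = field(default_factory=list)


def contract_low_support(node: Node, threshold: float) -> Node:
    """Return a copy of the tree with internal branches below `threshold` collapsed."""
    new_children: List[Node] = []
    for child in node.children:
        contracted = contract_low_support(child, threshold)
        is_internal = bool(contracted.children)
        if is_internal and contracted.support is not None and contracted.support < threshold:
            # Collapse the branch: its children attach directly to the current
            # node, which creates (or enlarges) a polytomy.
            new_children.extend(contracted.children)
        else:
            new_children.append(contracted)
    return Node(node.name, node.support, new_children)


def to_newick(node: Node) -> str:
    """Serialize the tree as a Newick string without the trailing semicolon."""
    if not node.children:
        return node.name or ""
    inner = ",".join(to_newick(c) for c in node.children)
    label = "" if node.support is None else f"{node.support:g}"
    return f"({inner}){label}"


# Toy gene tree ((A,B)5,(C,D)95,E); the (A,B) branch has only 5% support.
tree = Node(children=[
    Node(support=5, children=[Node("A"), Node("B")]),
    Node(support=95, children=[Node("C"), Node("D")]),
    Node("E"),
])
contracted = contract_low_support(tree, threshold=10)
print(to_newick(contracted) + ";")  # prints (A,B,(C,D)95,E);
```

With a threshold of 10, only the weakly supported (A,B) branch is removed, while the well-supported (C,D) branch is kept. Raising the threshold removes more branches, which corresponds to the more aggressive filtering that the abstract reports as harmful.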

Highlights

  • Evolutionary histories can be discordant across the genome, and such discordances need to be considered in reconstructing the species phylogeny

  • We introduce an improved version of ASTRAL called ASTRAL-III

  • Experimental setup: We study three research questions, the first of which (RQ1) is: Can contracting low support branches improve the accuracy of ASTRAL?


Summary

Introduction

Evolutionary histories can be discordant across the genome, and such discordances need to be considered in reconstructing the species phylogeny. The potential for genome-wide discordance of evolutionary histories [1, 2] has motivated the development of several approaches for species phylogeny reconstruction. (While “gene trees” need not be inferred from functional genes, following the conventions of the field, we will refer to them as such.) This two-step approach stands in contrast to concatenation [8], where all the data are combined and analyzed in a single analysis. ILS is typically modeled by the multi-species coalescent model (MSCM) [15, 16], where branches of the species tree represent populations, and lineages are

