Abstract

In two-step coalescent analyses of phylogenomic data, gene-tree topologies are treated as fixed prior to species-tree inference. Although all gene-tree conflict is assumed to be caused by lineage sorting when applying these methods, in empirical datasets much of the conflict can be caused by estimation error. Weakly supported and even arbitrarily resolved clades are important sources of this estimation error for gene trees inferred from few informative characters relative to the number of sampled terminals, and the resulting extraneous conflict among gene trees can negatively impact species-tree inference. In this study, we quantified the relative severity of alternative methods for collapsing gene-tree branches for seven empirical datasets and quantified their effects on species-tree inference. The branch-collapsing methods that we employed were based on the strict consensus of optimal topologies, various bootstrap thresholds, and 0% approximate likelihood ratio test (SH-like aLRT) support. Up to 86% of internal gene-tree branches are dubiously or arbitrarily resolved in reanalyses of these published phylogenomic datasets, and collapsing these branches increased inferred species-tree coalescent branch lengths by up to 455%. For two datasets, the longer inferred branch lengths sometimes impacted inference of anomaly-zone conditions. Although branch-collapsing methods did not consistently affect the species-tree topology, they often increased branch support. The more severe and clearly justified gene-tree branch-collapsing methods, which we recommend be broadly applied for two-step coalescent analyses, are use of the strict consensus in parsimony analyses and the collapse clades with 0% SH-like aLRT support in likelihood analyses. Collapsing dubiously or arbitrarily resolved branches in gene trees sometimes improved congruence between coalescent-based results and concatenation trees. In such cases, we contend that the resolution provided by concatenation should be preferred and that incomplete lineage sorting is a poor explanation for the initial conflict between phylogenetic approaches.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call