Abstract

The legend for Figure 6 contained inaccuracies. The corrected legend for Figure 6 reads: Integrating over branch length uncertainty causes misinterpretation of convergence as phylogenetic signal. We estimated the likelihood surface over branch lengths for datasets with expected character state pattern frequencies on a four-taxon star tree with long branches (0.75 substitutions/site) to termini A and C and short branches (0.05) to termini B and D. a, For each branch length, the likelihood is plotted for each of the three resolved trees, with the other lengths fixed at their ML values. Vertical dotted lines indicate the true branch lengths used to generate data. Likelihood functions are shown for expected datasets of N = 10,000 (top) and 100,000 (bottom). In both cases, the area under the curve for the long-branch attraction topology (red) exceeds that for the other topologies (blue and green, which are identical). b, The partial posterior probability of each resolved topology is shown for each character state pattern when branch lengths are integrated over (top) or fixed at their estimated values (bottom). Character state patterns are indicated using variables representing nucleotides of the same type: for example, pattern xyxy stands for the realizations ACAC, AGAG, ATAT, CACA, …, TGTG. Results are shown for the expected 10,000-nt dataset. c, The log likelihood ratio of the long-branch attraction tree (AC) to the AB tree is shown (left panel) for expected data of increasing sequence length generated on the star phylogeny. Right panel, corresponding posterior probability of each tree topology. Filled circles, BI; crosses, ML. d, Support for each topology is shown under similar conditions for a resolved true tree (internal branch length 0.001).
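The pattern notation in panel b can be made concrete with a short sketch. The Python below is an illustration only, not code from the paper; the names `pattern_class` and `realizations_by_class` are hypothetical. It relabels the bases in a four-taxon site pattern in order of first appearance (x, then y, z, w), so that every realization of a class such as xyxy maps to the same label.

```python
from collections import defaultdict
from itertools import product

NUCLEOTIDES = "ACGT"
LABELS = "xyzw"  # placeholder variables, assigned in order of first appearance


def pattern_class(column: str) -> str:
    """Map a site pattern such as 'ACAC' to its class label such as 'xyxy'."""
    seen = {}
    for base in column:
        if base not in seen:
            seen[base] = LABELS[len(seen)]
    return "".join(seen[base] for base in column)


def realizations_by_class(n_taxa: int = 4) -> dict:
    """Group every possible nucleotide column for n_taxa by pattern class."""
    groups = defaultdict(list)
    for bases in product(NUCLEOTIDES, repeat=n_taxa):
        column = "".join(bases)
        groups[pattern_class(column)].append(column)
    return dict(groups)


if __name__ == "__main__":
    groups = realizations_by_class()
    print(len(groups), "pattern classes")
    print("xyxy:", groups["xyxy"])
```

Grouping the 256 possible four-taxon columns this way gives the pattern classes over which per-pattern quantities, such as the partial posterior probabilities in panel b, can be tabulated.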

Highlights

  • Statistical inference of phylogenetic relationships informs analysis in fields as diverse as comparative genomics, epidemiology, ecology, and evolution [1]

  • Bayesian inference (BI)—using the common assumption of uniform priors over branch lengths—inferred as the maximum a posteriori tree the falsely resolved topology that pairs long branches together from over 70% of replicates, with mean posterior probability,0.6, when sequences were of moderate length

  • This long branch attraction (LBA) bias grew stronger with increasing sequence length, as indicated by a positive slope of the best-fit regression curve (P = 0.03)

Summary

Introduction

Statistical inference of phylogenetic relationships informs analysis in fields as diverse as comparative genomics, epidemiology, ecology, and evolution [1]. Bayesian inference (BI) and its precursor, maximum likelihood (ML), infer phylogenetic relationships using the same probabilistic models of molecular evolution, so it has been assumed that BI, like ML [7,8,9], is largely unbiased and statistically consistent given the correct model [6,10]. A key difference between BI and ML, and a major proposed advantage of BI [3,10,11,12], is that Bayesian methods incorporate uncertainty about "nuisance parameters" such as the branch lengths on the topology and the parameters of the evolutionary model; in contrast, ML requires specific values for these parameters to be estimated from the data. Because BI incorporates uncertainty about nuisance parameters, it has been favored over ML for implementing complex models with many parameters when data are limited [3,10,12,14].
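The contrast between maximizing over a nuisance parameter (ML) and integrating over it (BI) can be sketched numerically. The Python below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the JC69 substitution model (the text above does not name a model), uses the terminal branch lengths from the Figure 6 legend (0.75 for A and C, 0.05 for B and D), varies only the internal branch while holding terminal lengths at their generating values, and places a uniform prior on the internal branch over an arbitrary range [0, `t_max`] with equal prior weight on each topology. All function and variable names are hypothetical.

```python
import math
from itertools import product

STATES = range(4)  # nucleotides coded 0..3
TIPS = (0.75, 0.05, 0.75, 0.05)  # branches to A, B, C, D (from the legend)
PATTERNS = list(product(STATES, repeat=4))
# Each resolved topology is named by the taxon paired with A.
TOPOLOGIES = {"AB": (0, 1), "AC": (0, 2), "AD": (0, 3)}


def jc69(t):
    """Return (P[same base], P[one specific different base]) under JC69."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e


def pattern_prob(pattern, pair, tips, internal):
    """Probability of a site pattern; internal = 0 gives the star tree."""
    p_int = jc69(internal)
    p_tip = [jc69(t) for t in tips]
    total = 0.0
    for u, v in product(STATES, repeat=2):  # states at the two internal nodes
        p = 0.25 * p_int[u != v]  # bool index: 0 = same base, 1 = different
        for k in range(4):
            node = u if k in pair else v
            p *= p_tip[k][node != pattern[k]]
        total += p
    return total


# Expected pattern frequencies for data generated on the star tree.
STAR_FREQ = [pattern_prob(x, (0, 1), TIPS, 0.0) for x in PATTERNS]


def log_lik(pair, internal, n_sites):
    """Log-likelihood of expected star-tree data of length n_sites."""
    return n_sites * sum(
        f * math.log(pattern_prob(x, pair, TIPS, internal))
        for f, x in zip(STAR_FREQ, PATTERNS)
    )


def topology_support(n_sites, t_max=0.05, steps=20):
    """Return (ml, bi) support for each topology.

    ml: maximized log-likelihood relative to the best topology.
    bi: posterior probability after integrating over the internal branch.
    """
    grid = [t_max * k / steps for k in range(steps + 1)]
    curves = {n: [log_lik(p, t, n_sites) for t in grid]
              for n, p in TOPOLOGIES.items()}
    peak = max(max(c) for c in curves.values())
    ml = {n: max(c) - peak for n, c in curves.items()}
    area = {}
    for n, c in curves.items():
        lik = [math.exp(v - peak) for v in c]  # rescaled to avoid underflow
        # Trapezoid rule; the constant uniform prior density cancels out.
        area[n] = sum((a + b) / 2 for a, b in zip(lik, lik[1:])) * t_max / steps
    z = sum(area.values())
    return ml, {n: a / z for n, a in area.items()}


if __name__ == "__main__":
    for n_sites in (10_000, 100_000):
        ml, bi = topology_support(n_sites)
        print(n_sites, ml, bi)
```

Because the expected data come from the star tree, every topology attains its maximum likelihood at an internal branch length of zero, so the maximized scores tie. The integrated scores depend on how quickly each topology's likelihood falls away from that maximum, which is the quantity the BI column reports in this sketch.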
