Abstract

Consensus trees are required to summarize trees obtained through MCMC sampling of a posterior distribution, providing an overview of the distribution of estimated parameters such as topology, branch lengths, and divergence times. Numerous consensus tree construction methods are available, each presenting a different interpretation of the tree sample. The rise of morphological clock and sampled-ancestor methods of divergence time estimation, in which times and topology are coestimated, has increased the popularity of the maximum clade credibility (MCC) consensus tree method. The MCC method assumes that the sampled, fully resolved topology with the highest clade credibility is an adequate summary of the most probable clades, with parameter estimates from compatible sampled trees used to obtain the marginal distributions of parameters such as clade ages and branch lengths. Using both simulated and empirical data, we demonstrate that MCC trees, and trees constructed using the similar maximum a posteriori (MAP) method, often include poorly supported and incorrect clades when summarizing diffuse posterior samples of trees. We demonstrate that the paucity of information in morphological data sets contributes to the inability of MCC and MAP trees to accurately summarize the posterior distribution. Conversely, majority-rule consensus (MRC) trees contain a lower proportion of incorrect nodes when summarizing the same posterior samples of trees. Thus, we advocate the use of MRC trees, in place of MCC or MAP trees, when summarizing the results of Bayesian phylogenetic analyses of morphological data.

Highlights

  • Several methods are available to summarize the results from Bayesian posterior tree samples

  • Maximum Clade Credibility (MCC) trees recovered from 100-character data sets possessed the most correct nodes in absolute terms, with majority-rule consensus (MRC) and maximum a posteriori (MAP) trees possessing a similar number of correct clades to one another (Fig. 2)

  • When the number of incorrect nodes is subtracted from the number of correct nodes presented in each consensus tree, the MRC tree often exhibited a greater total level of accuracy than MCC trees, which in turn often exhibited a higher level of accuracy than MAP trees (Fig. 2)
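The net accuracy measure in the last highlight (correct nodes minus incorrect nodes) can be illustrated with a minimal sketch. The function name `node_accuracy`, the representation of a tree as a set of clades, and the four-taxon example are assumptions made for illustration only; they are not taken from the study.

```python
def node_accuracy(consensus_clades, true_clades):
    """Return (correct, incorrect, net) for a consensus tree.

    Each tree is represented as a set of clades, and each clade as a
    frozenset of taxon names. A clade counts as correct if it also
    appears in the true (generating) tree.
    """
    correct = len(consensus_clades & true_clades)
    incorrect = len(consensus_clades - true_clades)
    return correct, incorrect, correct - incorrect


# Hypothetical four-taxon example with true tree (((A,B),C),D).
AB, ABC, CD = frozenset("AB"), frozenset("ABC"), frozenset("CD")
true_tree = {AB, ABC}

fully_resolved = {AB, CD}   # resolves every node, one of them wrongly
partly_resolved = {AB}      # leaves the uncertain node as a polytomy

print(node_accuracy(fully_resolved, true_tree))   # (1, 1, 0)
print(node_accuracy(partly_resolved, true_tree))  # (1, 0, 1)
```

In this toy case both trees recover one correct node, but the fully resolved tree also commits to an incorrect one, so its net score is lower.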

Summary

Introduction

Several methods are available to summarize the results from Bayesian posterior tree samples. To obtain the MAP tree, the MCMC sampling procedure must be performed for an inordinate amount of time, as the goal is no longer to approximate the posterior distribution but, instead, to inefficiently find the tree with the greatest posterior probability. Another sampled-tree consensus method, Maximum Clade Credibility (MCC), is less susceptible to this source of error as it considers the distribution of clade support in the posterior sample of trees. MRC trees present divergence times on a set of well-supported (posterior probability >0.5) bifurcations, or soft polytomies, in the presence of uncertainty. Such a conservative approach to presenting topological uncertainty may be desirable in a Bayesian framework, in which obtaining the marginal posterior distribution of model parameters results in the explicit estimation of their uncertainty. By analyzing several empirical data matrices that are expected to possess differing amounts of observed information about the same set of divergences, we demonstrate that both the MCC and MAP methods are likely to be inappropriate as consensus trees when summarizing posterior samples of trees obtained from empirical morphological data.
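The three summaries described above can be contrasted with a small sketch. It assumes each sampled tree is reduced to the set of its non-trivial clades, takes a clade's posterior probability to be the fraction of sampled trees containing it, picks the MCC tree as the sampled tree with the highest product of clade probabilities, and approximates the MAP tree by the most frequently sampled topology. All function names and the ten-tree sample are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from collections import Counter
from math import log


def clade_frequencies(trees):
    """Fraction of sampled trees that contain each clade."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}


def mrc_clades(trees, threshold=0.5):
    """Majority-rule consensus: keep clades with support above the threshold."""
    return {c for c, p in clade_frequencies(trees).items() if p > threshold}


def mcc_tree(trees):
    """Sampled tree with the highest product of clade posterior probabilities."""
    freqs = clade_frequencies(trees)
    return max(trees, key=lambda tree: sum(log(freqs[c]) for c in tree))


def map_tree(trees):
    """Most frequently sampled topology (sample-based MAP estimate)."""
    return Counter(trees).most_common(1)[0][0]


def label(clades):
    """Readable form of a set of clades, e.g. ['AB', 'CD']."""
    return sorted("".join(sorted(c)) for c in clades)


# Hypothetical diffuse posterior sample on taxa A-D (tree: copies sampled).
AB, CD, ABC, ABD, ACD = (frozenset(s) for s in ("AB", "CD", "ABC", "ABD", "ACD"))
t1 = frozenset({AB, CD})    # ((A,B),(C,D))
t2 = frozenset({AB, ABC})   # (((A,B),C),D)
t3 = frozenset({AB, ABD})   # (((A,B),D),C)
t4 = frozenset({CD, ACD})   # (((C,D),A),B)
sample = [t1] * 3 + [t2] * 4 + [t3] * 1 + [t4] * 2

print("MRC clades:", label(mrc_clades(sample)))  # ['AB']
print("MCC tree:  ", label(mcc_tree(sample)))    # ['AB', 'CD']
print("MAP tree:  ", label(map_tree(sample)))    # ['AB', 'ABC']
```

In this toy sample only clade AB exceeds 0.5 support (0.8), so the MRC tree leaves the rest unresolved. The MCC and MAP trees are fully resolved and therefore each include a clade with support of 0.5 or less (CD at 0.5, ABC at 0.4).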
