Abstract

Large-scale phylogenies provide a valuable resource for studying background diversification rates and investigating whether the rates have changed over time. Unfortunately, most large-scale, dated phylogenies are sparsely sampled (fewer than 5% of the described species) and taxon sampling is not uniform. Instead, taxa are frequently sampled to obtain at least one representative per subgroup (e.g. family) and thus to maximize diversity (diversified sampling). So far, such complications have been ignored, potentially biasing the conclusions that have been reached. In this study I derive the likelihood of a birth-death process with non-constant (time-dependent) diversification rates and diversified taxon sampling. Using simulations I test whether the true parameters and the sampling method can be recovered when the trees are small or medium-sized (fewer than 200 taxa). The results show that the diversification rates can be inferred and that the estimates are unbiased for large trees but biased for small trees (fewer than 50 taxa). Furthermore, model selection by means of Akaike's Information Criterion favors the true model if the true rates differ sufficiently from alternative models (e.g. the birth-death model is recovered over a pure-birth model if the extinction rate is large). Finally, I applied six different diversification rate models, ranging from a constant-rate pure birth process to a decreasing speciation rate birth-death process but excluding any rate shift models, to three large-scale empirical phylogenies (ants, mammals and snakes with 149, 164 and 41 sampled species, respectively). All three phylogenies were constructed by diversified taxon sampling, as stated by the authors. However, only the snake phylogeny supported diversified taxon sampling. Moreover, a parametric bootstrap test revealed that none of the tested models provided a good fit to the observed data. The model assumptions, such as homogeneous rates across species or no rate shifts, appear to be violated.
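
The model-selection step mentioned above can be illustrated with a minimal sketch of an AIC comparison, using the standard definition AIC = 2k - 2 ln L. The function names and the log-likelihood values below are hypothetical and chosen only for illustration; they are not results from the study.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def rank_models(fits):
    """Rank models by AIC.

    fits maps a model name to (maximized log-likelihood, number of free parameters).
    Returns a list of (name, AIC, delta-AIC relative to the best model), best first.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    return sorted(
        ((name, score, score - best) for name, score in scores.items()),
        key=lambda item: item[1],
    )


# Hypothetical maximized log-likelihoods (illustration only, not from the paper).
fits = {
    "pure_birth": (-120.0, 1),   # one parameter: speciation rate
    "birth_death": (-118.5, 2),  # two parameters: speciation and extinction rates
}
ranking = rank_models(fits)
```

With these made-up values the extra parameter of the birth-death model is only just paid for by its higher likelihood, which mirrors the point that the true model is favored only when its rates differ sufficiently from the alternative.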

Highlights

  • Patterns of biodiversity reflected in phylogenetic estimates indicate that (1) rates of diversification are not constant over time or across the tree and (2) taxonomic sampling is both incomplete and non-random

  • It is well known how to accommodate uniform taxon sampling, where every taxon has the same probability of being included in the dataset, in inference based on the birth-death process [6,10]

  • Even under a constant-rate pure birth process, the maximum likelihood estimate (MLE) was biased for trees with fewer than 50 taxa, in contrast to the results of Morlon et al., who found no bias (see Figure S4 in [8])

Summary

Introduction

Patterns of biodiversity reflected in phylogenetic estimates indicate that (1) rates of diversification are not constant over time or across the tree and (2) taxonomic sampling is both incomplete and non-random. Taxa are often selected so that the diversity is maximized, e.g. by sampling at least one species per family [4,5]. This strategy is called diversified sampling [2]. The birth-death process with uniform taxon sampling has been extended to time-dependent rates [8] and diversity-dependent rates [11]. Diversified taxon sampling has only been considered in the context of constant rates [2] and, to my knowledge, the corresponding likelihood functions for non-constant rates have not been available previously.
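
To make the sampling scheme concrete, the following is a minimal simulation sketch. It assumes an idealized form of diversified sampling in which choosing m of n extant species to maximize diversity retains exactly the m - 1 oldest speciation events; this section does not spell out the formal definition, so that idealization, together with the function names and the parameter values, is an assumption made for illustration. The tree is generated under a constant-rate pure birth process started from a crown with two lineages.

```python
import random


def simulate_pure_birth_speciation_times(n_tips, birth_rate, rng):
    """Forward-simulate a constant-rate pure birth process.

    Starts at the crown (root split at time 0.0, two lineages) and stops as soon
    as n_tips lineages exist. Returns the n_tips - 1 speciation times, measured
    forward from the crown, in increasing order.
    """
    times = [0.0]      # the root split
    t, lineages = 0.0, 2
    while lineages < n_tips:
        # With k lineages, the waiting time to the next split is Exp(k * rate).
        t += rng.expovariate(lineages * birth_rate)
        times.append(t)
        lineages += 1
    return times


def diversified_sample(speciation_times, m_sampled):
    """Idealized diversified sampling (assumption): keep the m - 1 oldest splits."""
    return sorted(speciation_times)[: m_sampled - 1]


rng = random.Random(1)  # fixed seed so the sketch is reproducible
all_times = simulate_pure_birth_speciation_times(n_tips=100, birth_rate=0.1, rng=rng)
kept = diversified_sample(all_times, m_sampled=10)
```

Under this idealization every retained split is older than every discarded one, so the sampled tree contains no recent divergences. That is the kind of non-random pattern a likelihood assuming uniform sampling would not account for.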
