Abstract

Background: MTML-msBayes uses hierarchical approximate Bayesian computation (HABC) under a coalescent model to infer temporal patterns of divergence and gene flow across codistributed taxon-pairs. Under a model of multiple codistributed taxa that diverge into taxon-pairs with subsequent gene flow or isolation, one can estimate hyper-parameters that quantify the mean and variability in divergence times or test models of migration and isolation. The software uses multi-locus DNA sequence data collected from multiple taxon-pairs and allows variation across taxa in demographic parameters as well as heterogeneity in DNA mutation rates across loci. The method also allows a flexible sampling scheme: different numbers of loci of varying length can be sampled from different taxon-pairs.

Results: Simulation tests reveal increasing power with increasing numbers of loci when attempting to distinguish temporal congruence from incongruence in divergence times across taxon-pairs. These results are robust to DNA mutation rate heterogeneity. Estimating mean divergence times and testing simultaneous divergence was less accurate with migration, but improved if one specified the correct migration model. Simulation validation tests demonstrated that one can detect the correct migration or isolation model with high probability, and that this HABC model testing procedure was greatly improved by incorporating a summary statistic originally developed for this task (Wakeley's ΨW). The method is applied to an empirical data set of three Australian avian taxon-pairs and a result of simultaneous divergence with some subsequent gene flow is inferred.

Conclusions: To retain flexibility and compatibility with existing bioinformatics tools, MTML-msBayes is a pipeline software package consisting of Perl, C and R programs that are executed via the command line. Source code and binaries are available for download at http://msbayes.sourceforge.net/ under an open source license (GNU Public License).

Highlights

  • MTML-msBayes uses hierarchical approximate Bayesian computation (HABC) under a coalescent model to infer temporal patterns of divergence and gene flow across codistributed taxon-pairs

  • We assessed: (1) the advantage of re-ordering the taxon-pair elements within the summary statistic vector (Dm) by descending magnitude of πb averaged across loci, with respect to estimating Ω as a function of the number of taxon-pairs (Y) in the sample (Figure 2); (2) the effect of increasing numbers of loci (1, 4, 8, 16, 32, and 64 loci) when estimating E(τ) and Ω (Figures 3, 4, and 5); (3) the consequences of allowing for and ignoring rate heterogeneity across loci (Additional file 2); and (4) how different levels of post-divergence migration influence estimates of E(τ) and Ω, and how this is influenced by migration/isolation model misspecification (Figures 6 and 7)

  • Inferring how whole assemblages of species react to putative geographical barriers is central to obtaining these larger goals and MTML-msBayes will become an important bioinformatics tool for such inference given multi-level models with large amounts of complexity
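
The re-ordering step named in the second highlight can be illustrated with a minimal Python sketch. It assumes each taxon-pair contributes a small record of summary statistics already averaged across loci; the key names `pi_b` and `pi_w` and both helper functions are hypothetical and are not part of the MTML-msBayes code. Sorting the records by descending πb before flattening them means the distance between an observed and a simulated vector no longer depends on how the taxon-pairs happen to be labelled.

```python
import math

def reorder_and_flatten(per_pair_stats, stat_names=("pi_b", "pi_w")):
    """Sort taxon-pair records by descending pi_b, then flatten to one vector.

    per_pair_stats: list of dicts, one per taxon-pair (hypothetical layout).
    """
    ordered = sorted(per_pair_stats, key=lambda rec: rec["pi_b"], reverse=True)
    return [rec[name] for rec in ordered for name in stat_names]

def euclidean_distance(vec_a, vec_b):
    """Plain Euclidean distance between two equal-length vectors."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))

# Illustrative values only: the same two taxon-pairs listed in opposite orders.
observed = [{"pi_b": 0.02, "pi_w": 0.005}, {"pi_b": 0.08, "pi_w": 0.010}]
simulated = [{"pi_b": 0.08, "pi_w": 0.010}, {"pi_b": 0.02, "pi_w": 0.005}]

distance_after_reordering = euclidean_distance(
    reorder_and_flatten(observed), reorder_and_flatten(simulated)
)
```

With the re-ordering applied, the two inputs above map to the same vector, so their distance is zero; without it, the label order alone would produce a non-zero distance.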

Summary

Results

To ascertain how well MTML-msBayes quantifies the congruence of divergence times under a number of different conditions, we conducted an extensive simulation analysis by generating PODS (pseudo-observed data sets; [14]) and quantifying the accuracy and precision of estimates against the known parameter values used to generate the PODS. Likewise, when PODS were generated under a migration model, model misspecification resulted in higher estimator bias and less precision as quantified by RMSE and RMSPE. Overall, this simulation analysis demonstrates that quantifying the level of temporal congruence in multitaxa divergence is improved if one first tests for migration so that an appropriate hyper-prior model can be specified. The posterior for the model indicator parameter corresponding to isolation or the two different migration levels (Nm upper bounds of 1.0 and 10.0) was sampled from the 500 closest accepted matches obtained with the ABC algorithm, with and without subsequent polychotomous regression. The accuracy of this ABC model choice procedure was assessed by applying it to 100 PODS of five taxon-pairs and 16 loci simulated under each of the three different migration models (isolation and Nm upper bounds of 1.0 and 10.0). Each RMSPE is calculated from the true hyper-parameter value (E(τ) and Ω) and the corresponding 500 accepted points out of a total of 1,500,000 draws from the hyper-prior, using ABC with local linear regression and a summary statistic vector Dm that only included mean values of πb across loci from every taxon-pair.
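
As a rough illustration of the two quantities described above, the Python sketch below computes an RMSPE and a rejection-only model posterior. It assumes RMSPE means the root mean squared deviation of the accepted draws from the true value, and it approximates the model posterior by the share of each model among the closest simulations. It does not apply the local linear or polychotomous regression adjustments, and the function names, model labels and toy numbers are hypothetical rather than taken from MTML-msBayes.

```python
import math

def rmspe(accepted_draws, true_value):
    """Root mean squared deviation of accepted draws from the true value."""
    if not accepted_draws:
        raise ValueError("need at least one accepted draw")
    squared = [(draw - true_value) ** 2 for draw in accepted_draws]
    return math.sqrt(sum(squared) / len(squared))

def model_posterior(distances, model_labels, n_accept=500):
    """Share of each model among the n_accept simulations closest to the data.

    distances: distance of each simulated summary vector from the observed one.
    model_labels: model indicator used for each simulation (same order).
    """
    if len(distances) != len(model_labels):
        raise ValueError("distances and model_labels must align")
    ranked = sorted(zip(distances, model_labels), key=lambda pair: pair[0])
    closest = ranked[:n_accept]
    counts = {}
    for _, label in closest:
        counts[label] = counts.get(label, 0) + 1
    return {label: count / len(closest) for label, count in counts.items()}

# Toy inputs, far smaller than the 500-of-1,500,000 scheme in the text.
toy_rmspe = rmspe([1.0, 3.0], true_value=2.0)
toy_posterior = model_posterior(
    [0.1, 0.5, 0.2, 0.9],
    ["isolation", "Nm<=1", "isolation", "Nm<=10"],
    n_accept=3,
)
```

In the toy call, the three closest simulations are two isolation draws and one low-migration draw, so the rejection-only posterior favours isolation by two to one.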
