Abstract

Approximate Bayesian Computation (ABC) has become a popular technique in evolutionary genetics for elucidating population structure and history because of its flexibility. The statistical inference framework has benefited from significant progress in recent years. In population genetics, however, its outcome depends heavily on the amount of information in the dataset, whether that be the level of genetic variation or the number of samples and loci. Here we examine the power to reject a simple constant population size coalescent model in favor of a bottleneck model in datasets of varying quality. This power depends not only on the number of samples and loci but also, strongly, on the level of nucleotide diversity in the observed dataset. Whilst model choice in an ABC setting is, overall, fairly powerful and quite conservative with regard to false positives, detecting weaker bottlenecks is problematic in smaller or less genetically diverse datasets. This limits the inferences possible in non-model organisms, where the amount of information regarding the two models is often limited. Our results show that it is important to consider these limitations when performing an ABC analysis, and that studies should perform simulations based on the size and nature of the dataset in order to fully assess their power.

Highlights

  • Central to evolutionary biology and science in general is the need to quantitatively compare models and hypotheses

  • The means of some statistics, such as Tajima’s D, Fay & Wu’s H and the site frequency spectrum, appear independent of h, whereas the mean of He shows a strong correlation, reflecting an increase in the number of haplotypes with an increase in recombination

  • We began by exploring the behaviour of summary statistics in a bottleneck model, and then investigated the power that different sets of summary statistics have to separate a bottleneck model from a simple model of constant effective population size


Summary

Introduction

Central to evolutionary biology and science in general is the need to quantitatively compare models and hypotheses. In population genetics, estimating parameters from more complex, biologically realistic models often involves a likelihood function that is difficult to compute. This has led to the development of methods, such as Approximate Bayesian Computation (ABC; [1]), that aim to approximate the likelihood function by simulating under a given model and using summary statistics to capture key aspects of the data in the most informative way (see [2] for an historical overview). Because ABC is flexible and efficient, it is possible to compare a number of complex models and to estimate their parameters, and this has led to the widespread adoption of the method within the population genetics community for assessing and fitting demographic models to molecular data. Methods for estimating demographic histories have become increasingly important, and have fuelled the proliferation of studies using ABC to infer a suitable demographic model.
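The simulate-and-compare idea described above can be illustrated with a minimal rejection-ABC model choice sketch. This is not the procedure used in the study: the function `abc_model_choice`, the two toy simulators and every numerical setting are hypothetical, and the simulators draw Gaussian "summary statistics" instead of running a coalescent simulation. The sketch only shows the general logic: simulate from each model, keep the simulations whose summary statistics lie closest to the observed ones, and take the share of accepted simulations per model as an approximate posterior model probability.

```python
import math
import random


def abc_model_choice(observed, simulators, n_sims=5000, keep=0.02, rng=None):
    """Rejection-ABC model choice with equal prior probability per model.

    observed   : tuple of observed summary statistics
    simulators : dict mapping a model name to a function taking a
                 random.Random instance and returning a tuple of summary
                 statistics (parameters are drawn from the prior inside)
    keep       : fraction of simulations retained (the tolerance)
    Returns a dict of approximate posterior model probabilities.
    """
    rng = rng or random.Random()
    names = list(simulators)
    draws = []
    for _ in range(n_sims):
        name = rng.choice(names)                 # uniform prior over models
        stats = simulators[name](rng)            # simulate summary statistics
        draws.append((math.dist(observed, stats), name))  # Euclidean distance
    draws.sort(key=lambda d: d[0])
    accepted = draws[:max(1, int(keep * n_sims))]  # closest simulations only
    return {
        m: sum(1 for _, n in accepted if n == m) / len(accepted)
        for m in names
    }


# Hypothetical stand-ins for real simulators: NOT coalescent models.
def toy_constant(rng):
    # Both statistics centred on zero under the "constant size" toy model.
    return (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def toy_bottleneck(rng):
    # A "strength" parameter drawn from a uniform prior shifts the first
    # statistic; weak strengths overlap heavily with the constant model.
    strength = rng.uniform(0.5, 3.0)
    return (rng.gauss(strength, 1.0), rng.gauss(0.0, 1.0))


if __name__ == "__main__":
    models = {"constant": toy_constant, "bottleneck": toy_bottleneck}
    probs = abc_model_choice((4.0, 0.0), models, rng=random.Random(1))
    print(probs)
```

In a real analysis the toy simulators would be replaced by coalescent simulations matched to the number of samples and loci in the dataset, and the statistics would usually be scaled before distances are computed. Repeating the call on pseudo-observed datasets simulated under a known model is one way to gauge how often the correct model is recovered.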

