Abstract

The sample frequency spectrum (SFS), or histogram of allele counts, is an important summary statistic in evolutionary biology, and is often used to infer the history of population size changes, migrations, and other demographic events affecting a set of populations. The expected multipopulation SFS under a given demographic model can be efficiently computed when the populations in the model are related by a tree, scaling to hundreds of populations. Admixture, back-migration, and introgression are common natural processes that violate the assumption of a tree-like population history, however, and until now the expected SFS could be computed for only a handful of populations when the demographic history is not a tree. In this article, we present a new method for efficiently computing the expected SFS and linear functionals of it, for demographies described by general directed acyclic graphs. This method can scale to more populations than previously possible for complex demographic histories including admixture. We apply our method to an 8-population SFS to estimate the timing and strength of a proposed “basal Eurasian” admixture event in human history. We implement and release our method in a new open-source software package momi2.

Highlights

  • All natural populations undergo evolutionary processes of migration, size changes, and divergence, and the history of these demographic events shapes their present genetic diversity

  • Inferring demographic history is of central concern in evolutionary and population genetics, both for its intrinsic interest (e.g., in dating the out-of-Africa migration of modern humans (Schaffner et al, 2005; Gutenkunst et al, 2009)) and for biological applications (such as distinguishing the effects of natural selection from demography (Beaumont and Nichols, 1996; Boyko et al, 2008))

  • The joint sample frequency spectrum (SFS) is the multidimensional histogram of mutant allele counts in a sample of DNA sequences, and is a popular summary statistic which lies at the core of hundreds of


Summary

Introduction

All natural populations undergo evolutionary processes of migration, size changes, and divergence, and the history of these demographic events shapes their present genetic diversity. The expected SFS can be efficiently computed when the demographic history is a tree, and in previous work we developed a method, momi, to compute the SFS of hundreds of populations related by a tree (Kamm et al, 2017). Natural populations are often related by a more complex history that is not tree-like, as gene flow (the exchange of migrants between populations) adds extra edges to the topology associated with the demographic history. In this case, computing the expected SFS is much more computationally demanding, and existing methods for computing the exact expected SFS can scale to only a handful of populations (Gutenkunst et al, 2009; Jouganous et al, 2017). The Appendix contains all proofs, an analysis of the computational complexity of our method, and additional details of the application to ancient DNA.
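To make the object under discussion concrete, the following minimal sketch tabulates an observed joint SFS from per-site derived-allele counts using NumPy. It is an illustration only, not the momi2 API and not the method for computing the expected SFS: the function name `joint_sfs`, the input layout (one row per segregating site, one column per population), and the toy data are all assumptions made for this example.

```python
import numpy as np


def joint_sfs(derived_counts, sample_sizes):
    """Tabulate a joint SFS (hypothetical helper, not part of momi2).

    derived_counts: integer array of shape (n_sites, n_pops); entry [s, p]
        is the number of derived alleles observed at site s in population p.
    sample_sizes: number of sampled chromosomes in each population.

    Returns an integer array of shape (n_1 + 1, ..., n_P + 1) whose entry
    [k_1, ..., k_P] counts the sites with exactly that count configuration.
    """
    counts = np.asarray(derived_counts, dtype=int)
    sizes = tuple(int(n) for n in sample_sizes)
    if counts.ndim != 2 or counts.shape[1] != len(sizes):
        raise ValueError("derived_counts must have one column per population")
    if (counts < 0).any() or (counts > np.array(sizes)).any():
        raise ValueError("allele counts must lie between 0 and the sample size")

    sfs = np.zeros(tuple(n + 1 for n in sizes), dtype=int)
    # np.add.at accumulates correctly even when a configuration repeats.
    np.add.at(sfs, tuple(counts.T), 1)
    return sfs


# Toy data (assumed): 5 sites, 2 populations with 2 and 3 sampled chromosomes.
toy_counts = np.array([[1, 0], [1, 0], [2, 3], [0, 1], [1, 0]])
toy_sfs = joint_sfs(toy_counts, sample_sizes=(2, 3))

# Summing over the second axis gives the marginal SFS of the first population.
marginal_pop0 = toy_sfs.sum(axis=1)
```

The resulting array has one axis per population, so its size grows as the product of the sample sizes plus one. That growth is one reason the dimensionality of the joint SFS becomes a practical concern as more populations are added.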

Background
Demographic events
Likelihoods and the site frequency spectrum
Existing work and our contribution
Method
Example
Algorithms and formulas
Normalizing constant and other linear functionals
Application
Model and notation
Proofs
Proof of Theorem 1
Proof of Lemma 5
Proof of Lemma 1
Proof of Lemma 2
Proof of Lemma 4
Proof of Theorem 2
Computational complexity
Application supplement
Model fitting procedure
Findings
Mutation rate estimation
