Abstract

Current procedures for inferring population history generally assume complete neutrality; that is, they neglect both direct selection and the effects of selection on linked sites. Here we examine how direct purifying selection and background selection may bias demographic inference by evaluating two commonly used methods (MSMC and fastsimcoal2), focusing on how the shape of the distribution of fitness effects and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the distribution of fitness effects as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.

Highlights

  • The characterization of past population size change is a central goal of population genomic analysis, with applications ranging from anthropological to agricultural to clinical

  • When two or four diploid genomes were used for inference, the multiple sequentially Markovian coalescent (MSMC) inferred a recent many-fold growth for all segment sizes even when the true model was equilibrium, but it performed well when using one diploid genome with large segments

  • Extra caution should be used when interpreting population size changes inferred by MSMC when using more than one diploid individual

Summary

Introduction

The characterization of past population size change is a central goal of population genomic analysis, with applications ranging from anthropological to agricultural to clinical (see review by Beichman et al 2018). There is evidence for weak direct selection on all of these categories in multiple organisms (e.g., Andolfatto 2005; Chamary and Hurst 2005; Haddrill et al 2005; Lynch 2007; Zeng and Charlesworth 2010; Choi and Aquadro 2016; Jackson et al 2017). It is clear that such sites near or in coding regions will experience background selection (BGS; Charlesworth et al 1993; Charlesworth 2013) and may periodically be affected by selective sweeps as well (Messer and Petrov 2013; Schrider et al 2016). These effects are known to change the local effective population size and to alter both the levels and patterns of variation and linkage disequilibrium (Charlesworth et al 1993; Kaiser and Charlesworth 2009; O’Fallon et al 2010; Charlesworth 2013; Nicolaisen and Desai 2013; Ewing and Jensen 2016; Johri et al 2020).
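
The "levels and patterns of variation" referred to above are usually summarized by nucleotide diversity and the site frequency spectrum (SFS). The following minimal sketch shows how both can be computed from a small haplotype matrix. It assumes biallelic sites coded 0 (ancestral) and 1 (derived); the function names and the toy data are hypothetical illustrations, not part of the study's pipeline.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: sfs[k] counts sites where the derived allele
    is carried by exactly k of the n sampled haplotypes."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    sfs = [0] * (n + 1)
    for j in range(n_sites):
        k = sum(h[j] for h in haplotypes)  # derived-allele count at site j
        sfs[k] += 1
    return sfs


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences per site."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    differing_pairs = 0
    for j in range(n_sites):
        k = sum(h[j] for h in haplotypes)
        # k derived and (n - k) ancestral copies give k * (n - k) differing pairs
        differing_pairs += k * (n - k)
    n_pairs = n * (n - 1) / 2
    return differing_pairs / (n_pairs * n_sites)


# Toy data (assumed for illustration): 4 haplotypes, 5 sites.
haps = [
    [0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0],
]

sfs = site_frequency_spectrum(haps)
pi = nucleotide_diversity(haps)
print("SFS:", sfs)
print("pi per site:", round(pi, 4))
```

An excess of low-frequency derived alleles in the SFS, together with reduced diversity, is the kind of signal that can be produced either by population growth or by selection at linked sites, which is why the two can be confounded.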

