Selective sweeps, resulting from the spread of beneficial, neutral, or deleterious mutations through a population, shape patterns of genetic variation at linked neutral sites. While many theoretical, computational, and statistical advances have been made in understanding the genomic signatures of selective sweeps in recombining populations, substantially less is understood about populations with little or no recombination. We present a mathematical framework based on diffusion theory for obtaining the site frequency spectrum (SFS) at linked neutral sites during and immediately after the fixation of moderately or strongly beneficial mutations. We find that when a single hard sweep occurs, the SFS decays as 1/x for low derived allele frequencies (x), similar to the neutral SFS at equilibrium, whereas at higher derived allele frequencies it follows a 1/x² power law. These power laws are universal in the sense that they are independent of the dominance and inbreeding coefficients, and they also characterize the SFS during the sweep. Additionally, we find that the derived allele frequency at which the SFS shifts from the 1/x law to the 1/x² law is inversely proportional to the selection strength; thus, under strong selection, the SFS follows the 1/x² dependence for most allele frequencies, resembling that of a rapidly expanding neutral population. When clonal interference is pervasive, the SFS immediately post-fixation becomes U-shaped and is better explained by the equilibrium SFS of selected sites. Our results will be important in developing statistical methods to infer the timing and strength of recent selective sweeps in asexual populations, genomic regions that lack recombination, and clonally propagating tumor populations.
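The two power-law regimes described above can be illustrated with a toy piecewise function. The following Python sketch is not the diffusion-theory result: the function names, the proportionality constant `k`, the `amplitude` prefactor, and the treatment of `s` as a dimensionless selection strength are all assumptions made for illustration. It encodes only the qualitative statements of the abstract, namely a 1/x decay below a crossover frequency, a 1/x² decay above it, and a crossover that scales inversely with the selection strength.

```python
import math


def crossover_frequency(s, k=1.0):
    """Hypothetical crossover x_c = k / s, capped at 1.

    Only the inverse proportionality to s comes from the text;
    the constant k is an assumed placeholder.
    """
    return min(k / s, 1.0)


def toy_sfs(x, s, k=1.0, amplitude=1.0):
    """Toy SFS: amplitude/x below x_c, amplitude*x_c/x**2 at or above x_c.

    The two branches are matched so the function is continuous at x_c.
    The prefactors are illustrative, not derived.
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    x_c = crossover_frequency(s, k)
    if x < x_c:
        return amplitude / x
    return amplitude * x_c / x**2


def log_log_slope(f, x, rel_step=1e-3):
    """Central-difference estimate of d log f / d log x at x."""
    x_lo, x_hi = x * (1.0 - rel_step), x * (1.0 + rel_step)
    return (math.log(f(x_hi)) - math.log(f(x_lo))) / (
        math.log(x_hi) - math.log(x_lo)
    )


if __name__ == "__main__":
    s = 100.0  # assumed dimensionless selection strength
    sfs = lambda x: toy_sfs(x, s)
    print("crossover frequency:", crossover_frequency(s))
    print("slope at x = 0.001:", log_log_slope(sfs, 0.001))
    print("slope at x = 0.1:", log_log_slope(sfs, 0.1))
```

Under these assumptions, doubling `s` halves the crossover frequency, so a larger share of the frequency range falls in the 1/x² regime, which is the sense in which strong selection makes the 1/x² dependence hold for most allele frequencies.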