Abstract

A composite likelihood ratio test implemented in the program sweepfinder is a commonly used method for scanning a genome for recent selective sweeps. sweepfinder uses information on the spatial pattern (along the chromosome) of the site frequency spectrum around the selected locus. To avoid confounding effects of background selection and variation in the mutation process along the genome, the method is typically applied only to sites that are variable within species. However, the power to detect and localize selective sweeps can be greatly improved if invariable sites are also included in the analysis. In the spirit of a Hudson–Kreitman–Aguadé test, we suggest adding fixed differences relative to an out‐group to account for variation in mutation rate, thereby facilitating more robust and powerful analyses. We also develop a method for including background selection, modelled as a local reduction in the effective population size. Using simulations, we show that these advances lead to a gain in power while maintaining robustness to mutation rate variation. Furthermore, the new method also provides more precise localization of the causative mutation than methods using the spatial pattern of segregating sites alone.

Highlights

  • Nielsen et al (2005) argued that the use of the overall genomic site frequency spectrum (SFS) to represent the neutral case leads to increased robustness, and showed that the method was robust to a two-epoch growth model and an isolation–migration model with population growth in both populations, with parameters estimated from human single nucleotide polymorphism (SNP) data (Marth et al 2004)

  • The root-mean-square error (RMSE) of the estimated location of the sweep increases for older sweeps (Fig. 3b)

  • We evaluated the performance of a composite likelihood ratio test for detecting selective sweeps (Nielsen et al 2005) when including fixed differences in the likelihood ratio in addition to SFS information, using extensive simulations


Summary

Introduction

Rapid advances in sequencing technology during the past few years have facilitated studies using genomewide molecular data for detecting signatures of selective sweeps (Akey et al 2002; Carlson et al 2005; Kelley et al 2006; Voight et al 2006; Wang et al 2006; Kimura et al 2007; Sabeti et al 2007; Tang et al 2007; Williamson et al 2007; Xia et al 2009; Qanbari et al 2012; Chavez-Galarza et al 2013; Long et al 2013; Ramey et al 2013; Huber et al 2014), and a large number of compu- [...]

In this study, we are solely concerned with the model of a classical hard selective sweep in a single population, and we assume that the beneficial mutation has reached fixation not too long ago. Kim & Stephan (2002) proposed a composite likelihood ratio statistic based on calculating the product of marginal likelihood functions for all sites on a chromosome under models with and without a selective sweep at a particular position, and under the assumption of a panmictic population of constant size. The resulting composite likelihood ratio is computed for each position of interest to evaluate the evidence for a sweep at those positions. This method does incorporate information regarding the SFS, but does so in a way that uses the spatial distribution (along the chromosome) of segregating alleles of different frequencies. The distribution of the SFS under the alternative hypothesis of selection is derived by considering the way a selective sweep would modify the observed background distribution of allele frequencies. This leads to a computationally fast method, facilitating genomewide analyses. It has become clear that, while this method may be more robust than some previous SFS-based approaches, it can produce a high proportion of false positives if there has been a strong recent bottleneck in population size but a standard neutral model is used to calculate critical values (Jensen et al 2005; Pavlidis et al 2008).
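The composite likelihood ratio described above can be illustrated with a minimal sketch. It is not the sweepfinder implementation: the function names, the toy data and the grid of sweep parameters are all hypothetical, and `toy_sweep_spectrum` is only a placeholder for the sweep-modified spectrum, not the published sweep model. The sketch shows the two ingredients named in the text: a product of marginal per-site likelihoods (a sum on the log scale) and a ratio of the sweep model to the background model at a candidate position. The factor of 2 is assumed here as a common convention for likelihood ratio statistics.

```python
import math

def composite_log_likelihood(counts, spectra):
    # Sites are treated as independent, so the log of the product of
    # marginal likelihoods is a sum of per-site log probabilities.
    return sum(math.log(spec[k]) for k, spec in zip(counts, spectra))

def toy_sweep_spectrum(background, distance, alpha):
    # Hypothetical placeholder, NOT the published sweep model: a fraction
    # exp(-alpha * distance) of the probability mass is drawn from a
    # spectrum skewed toward rare and common classes, the rest from the
    # background spectrum. The effect fades with distance from the sweep.
    m = len(background)
    weights = [1.0 / (i + 1) + 1.0 / (m - i) for i in range(m)]
    total = sum(weights)
    affected = math.exp(-alpha * distance)
    return [(1.0 - affected) * b + affected * w / total
            for b, w in zip(background, weights)]

def clr_at(position, sites, background, alphas):
    # sites: list of (coordinate, frequency_class_index) pairs.
    counts = [k for _, k in sites]
    null = composite_log_likelihood(counts, [background] * len(sites))
    # Maximize the sweep model over a grid of candidate alpha values.
    best = max(
        composite_log_likelihood(
            counts,
            [toy_sweep_spectrum(background, abs(x - position), a)
             for x, _ in sites],
        )
        for a in alphas
    )
    return 2.0 * (best - null)

# Toy inputs (assumed for illustration): a background spectrum over five
# frequency classes and five segregating sites along a chromosome.
background = [0.4, 0.25, 0.15, 0.12, 0.08]
sites = [(100, 0), (220, 4), (350, 0), (900, 2), (1500, 1)]
alphas = [1e-4, 1e-3, 1e-2, 1e6]

# Scan two candidate sweep positions and record the statistic at each.
scan = {pos: clr_at(pos, sites, background, alphas) for pos in (250, 1200)}
```

Because the background spectrum enters both models, the statistic measures only the local departure from the genomewide pattern. In this sketch the very large value in `alphas` makes the sweep model collapse to the background model, so the statistic cannot be negative.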
