Identifying consistent allele frequency differences in studies of stratified populations.

R Axel W Wiberg, Michael B Morrissey, Oscar E Gaggiotti, Michael G Ritchie

doi:10.1111/2041-210x.12810


Open Access


Journal: Methods in Ecology and Evolution
Publication Date: Jun 15, 2017
Citations: 49
License type: CC BY 4.0

Affiliation: University of St Andrews

Abstract

With increasing application of pooled‐sequencing approaches to population genomics, robust methods are needed to accurately quantify allele frequency differences between populations. Identifying consistent differences across stratified populations can allow us to detect genomic regions that are under selection and that differ between populations with different histories or attributes. Current popular statistical tests are easily implemented in widely available software tools, which makes them simple for researchers to apply. However, there are potential problems with the way such tests are used, which means that underlying assumptions about the data are frequently violated.

These problems are highlighted by simulation of simple but realistic population genetic models of neutral evolution, and the performance of different tests is assessed. We present alternative tests (including Generalised Linear Models [GLMs] with quasibinomial error structure) with attractive properties for the analysis of allele frequency differences and re‐analyse a published dataset.

The simulations show that common statistical tests for consistent allele frequency differences perform poorly, with high false positive rates. Applying tests that do not confound heterogeneity and main effects significantly improves inference. Variation in sequencing coverage likely produces many false positives, and re‐scaling allele frequencies to counts out of a common value or an effective sample size reduces this effect.

Many researchers are interested in identifying allele frequencies that vary consistently across replicates in order to identify loci underlying phenotypic responses to selection or natural variation in phenotypes. Popular methods that have been suggested for this task perform poorly in simulations. Overall, quasibinomial GLMs perform better, have the attractive feature of allowing correction for multiple testing by standard procedures, and are easily extended to other designs.
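The re-scaling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `rescale_counts`, the common value of 100, and the example read counts are all assumptions made for illustration. It shows only the "counts out of a common value" variant; the effective sample size variant is not shown.

```python
def rescale_counts(alt_count, depth, common_n):
    """Rescale an allele count observed at `depth` reads to a count out of `common_n`.

    Hypothetical helper: keeps the observed allele frequency but expresses it
    as (alt, ref) counts out of the same total for every pool, so that pools
    with deeper coverage do not carry more weight in a downstream test.
    """
    if depth <= 0 or common_n <= 0:
        raise ValueError("depth and common_n must be positive")
    if not 0 <= alt_count <= depth:
        raise ValueError("alt_count must lie between 0 and depth")
    # Integer arithmetic first, then a single division, to limit float noise.
    alt = round(alt_count * common_n / depth)
    return alt, common_n - alt


# Illustrative (made-up) pools with very different coverage: (alt reads, depth).
pools = [(30, 120), (12, 40), (90, 300)]
rescaled = [rescale_counts(alt, depth, common_n=100) for alt, depth in pools]
print(rescaled)
```

After re-scaling, every pool contributes counts out of the same total, so differences in the counts reflect differences in frequency rather than in coverage.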

Highlights

With the increasing application of pooled genome sequencing approaches to population genomics (Boitard, Schlötterer, Nolte, Pandey, & Futschik, 2012; Ferretti, Ramos-Onsins, & Pérez-Enciso, 2013; Schlötterer, Kofler, Versace, Tobler, & Franssen, 2015; Schlötterer, Tobler, Kofler, & Nolte, 2014), researchers are interested in accurately quantifying allele frequency differences between populations and using these to infer the action of selection.
The aim is usually to determine whether the frequencies of an allele at a particular marker consistently differ between subsets of a population, or whether such differences are consistent across replicated experimental evolution lines.
Very little attention has been paid to the pseudoreplication of allele counts that is inherent in pool-seq experimental designs. We show that these violations of statistical assumptions produce high false discovery rates (FDRs).
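One rough way to see why treating pooled read counts as independent binomial draws can mislead is to compute a Pearson dispersion statistic across replicates. The sketch below is an illustration under assumptions, not the authors' method: the function name `pearson_dispersion` and the example counts are hypothetical, and the model is the simplest possible one (a single shared allele frequency). Values well above 1 suggest the counts vary more than a binomial model allows, which is the extra variation a quasibinomial error structure is meant to absorb.

```python
def pearson_dispersion(alt_counts, depths):
    """Pearson chi-square divided by its degrees of freedom.

    Hypothetical helper: assumes all replicates share one allele frequency,
    estimated by pooling the counts, and measures how much the replicate
    counts scatter around it relative to binomial expectations.
    """
    if len(alt_counts) != len(depths) or len(alt_counts) < 2:
        raise ValueError("need at least two replicates with matching depths")
    p = sum(alt_counts) / sum(depths)  # pooled allele frequency estimate
    if p in (0.0, 1.0):
        raise ValueError("allele is fixed; dispersion is undefined")
    chi2 = sum(
        (a - n * p) ** 2 / (n * p * (1 - p))
        for a, n in zip(alt_counts, depths)
    )
    return chi2 / (len(alt_counts) - 1)


# Illustrative (made-up) replicate counts, each out of 100 reads.
phi = pearson_dispersion([20, 50, 80], [100, 100, 100])
print(phi)
```

A statistic near 1 is what binomial sampling alone would produce; the made-up replicates here scatter far more widely than that.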

Summary

Introduction

With the increasing application of pooled genome sequencing (pool-seq) approaches to population genomics (Boitard, Schlötterer, Nolte, Pandey, & Futschik, 2012; Ferretti, Ramos-Onsins, & Pérez-Enciso, 2013; Schlötterer, Kofler, Versace, Tobler, & Franssen, 2015; Schlötterer, Tobler, Kofler, & Nolte, 2014), researchers are interested in accurately quantifying allele frequency differences between populations and using these to infer the action of selection. Such data can provide insights into the evolutionary and demographic history of populations, and can be used to identify regions under selection and alleles that consistently differ in frequency between population substrata with different characteristics, across populations. Markers that show a consistent difference across replicates are more likely to be functionally important in producing the phenotype under study.

