Empirical Validation of Pooled Whole Genome Population Re-Sequencing in Drosophila melanogaster

Yuan Zhu, Josefa González, Dmitri A Petrov, Alan O Bergland

doi:10.1371/journal.pone.0041901

Abstract

The sequencing of pooled non-barcoded individuals is an inexpensive and efficient means of assessing genome-wide population allele frequencies, yet its accuracy has not been thoroughly tested. We assessed the accuracy of this approach on whole, complex eukaryotic genomes by resequencing pools of largely isogenic, individually sequenced Drosophila melanogaster strains. We called SNPs in the pooled data and estimated false positive and false negative rates using the SNPs called in the individual strains as a reference. We also estimated the allele frequencies of the SNPs from the “pooled” data and compared them with “true” frequencies taken from the estimates in the individual strains. We demonstrate that pooled sequencing provides a faithful estimate of population allele frequency, with the error well approximated by binomial sampling, and is a reliable means of novel SNP discovery with low false positive rates. However, a sufficient number of strains should be used in the pooling, because variation in the amount of DNA derived from individual strains is a substantial source of noise when the number of pooled strains is low. Our results and analysis confirm that pooled sequencing is a very powerful and cost-effective technique for assessing patterns of sequence variation in populations on genome-wide scales, and is applicable to any dataset where sequencing individuals or individual cells is impossible, difficult, time consuming, or expensive.
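The two error sources named in the abstract, binomial sampling of reads and unequal DNA contribution from individual strains, can be illustrated with a small simulation. The sketch below is not the authors' pipeline: the function names (`binomial_se`, `pool_frequency`, `simulate_pool_estimate`), the strain counts, the read depth, and the contribution weights are all hypothetical, and each strain is assumed to be fully isogenic (carrying either the reference or the alternate allele).

```python
import math
import random


def binomial_se(p, depth):
    """Standard error of an allele frequency estimated from `depth` reads,
    assuming reads are independent draws (pure binomial sampling)."""
    return math.sqrt(p * (1.0 - p) / depth)


def pool_frequency(genotypes, weights):
    """Alternate-allele frequency actually present in the pooled DNA.

    genotypes: 0 (reference) or 1 (alternate) per isogenic strain.
    weights:   relative amount of DNA each strain contributes (hypothetical).
    """
    total = sum(weights)
    return sum(g * w for g, w in zip(genotypes, weights)) / total


def simulate_pool_estimate(genotypes, weights, depth, rng):
    """Sample `depth` reads from the pool and return the estimated frequency."""
    p_pool = pool_frequency(genotypes, weights)
    alt_reads = sum(1 for _ in range(depth) if rng.random() < p_pool)
    return alt_reads / depth


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable

    # Hypothetical pool: 40 strains, 10 of which carry the alternate allele.
    genotypes = [1] * 10 + [0] * 30
    true_freq = sum(genotypes) / len(genotypes)

    equal = [1.0] * len(genotypes)
    # Assumed skew: one alternate-allele strain contributes 5x more DNA.
    skewed = [5.0] + [1.0] * (len(genotypes) - 1)

    depth = 100
    print("true frequency:", true_freq)
    print("expected binomial SE:", binomial_se(true_freq, depth))
    print("pool frequency, equal DNA:", pool_frequency(genotypes, equal))
    print("pool frequency, skewed DNA:", pool_frequency(genotypes, skewed))
    print("one simulated estimate:",
          simulate_pool_estimate(genotypes, equal, depth, rng))
```

With equal contributions the pool frequency matches the true frequency, and only the read-sampling term remains. With skewed contributions the pool frequency itself shifts, a bias that additional read depth cannot remove, which is consistent with the recommendation to pool a sufficient number of strains.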

Highlights

Efficient assessment of the presence and frequencies of single-nucleotide polymorphisms (SNPs) in populations is vital to answering key problems in genetics and population biology
We investigated the effects of mapping strategy, read depth, and unequal DNA contribution on the accuracy of population allele frequency estimation from pooled sequencing, as well as the reproducibility of the technique
We looked at DGRP SNP positions that were covered to ≥10× read depth in library A and compared DGRP allele frequency estimates to those from library A

Summary

Introduction

Efficient assessment of the presence and frequencies of single-nucleotide polymorphisms (SNPs) in populations is vital to answering key problems in genetics and population biology. Inference of demographic history, identification of causative loci affecting a trait of interest, discovery of cancer-causing mutations in mixed pools of cells, and the search for evidence of natural selection in the genome all require knowledge of the frequency spectra in groups of individuals or cells. Individually sequencing dozens of individuals from each population is often costly and labor intensive. Multiplexing techniques allow a more efficient use of sequencing resources but still require a large number of individual DNA extractions, manipulations of reagents, barcoding oligos, PCR reactions, and sequencing library constructions. Pooling individuals prior to DNA extraction and sequencing the pooled DNA without barcodes can generate an inexpensive and efficient assessment of allele frequencies genome-wide.



Journal: PLoS ONE
Publication Date: Jul 26, 2012
Citations: 119
License type: CC BY 4.0
