Abstract

Background: Sample pooling is a method widely used in studies to reduce costs and labour. DNA sample pooling combined with massively parallel sequencing is a powerful tool for discovering DNA variants (polymorphisms) in large populations, and it underpins research fields such as genome-wide association studies and evolutionary and population studies. Using overlapping pools, where each sample is present in multiple pools, can enhance the accuracy of polymorphism detection and allows carriers of rare variants to be identified. Surprisingly, there is a lack of tools for result interpretation and carrier identification, i.e. for “depooling”.

Results: Here we present s-dePooler, an application for the analysis of data from pooling experiments. s-dePooler uses the variant information (a VCF file) and the pooling scheme to produce a list of candidate carriers for each polymorphism. We incorporated s-dePooler into a pipeline (dePoP) that automates pooling analysis. The performance of the pipeline was tested on a synthetic dataset built from 1000 Genomes Project data, resulting in the successful identification of 97% of carriers of polymorphisms present in fewer than ~10% of carriers.

Conclusions: s-dePooler, along with dePoP, can be used to identify carriers of polymorphisms in overlapping pools and is compatible with any pooling scheme with equivalent molar ratios of pooled samples. s-dePooler and dePoP, with usage instructions and test data, are freely available at https://github.com/lab9arriam/depop.

Highlights

  • Sample pooling is a method widely used in studies to reduce costs and labour

  • We developed s-dePooler, an application for determining minor allele carriers from the results of next-generation sequencing (NGS) of overlapping pools, the first application developed for this specific purpose. s-dePooler (i) estimates the most probable number of minor allele copies in each pool (Allele–Pool distribution) based on the numbers of reference and alternative reads and (ii) determines the subsets of specimens carrying that minor allele (Allele–Sample distribution) that satisfy the pooling scheme

  • To emulate the mixing of sequencing libraries and the sample pooling, the proportions of pool reads in an emulated lane were generated according to the Dirichlet distribution with all parameters equalling 4; proportions of sample reads in each pool were generated according to the Dirichlet distribution with all parameters equalling 3
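The read-proportion emulation described in the last highlight can be sketched as follows. This is a minimal illustration, not the code used by the authors: the function names, the pool count, and the pool size are assumptions chosen for the example, and the Dirichlet draw is implemented with the standard construction of normalising independent Gamma variates.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one vector from Dirichlet(alphas) by normalising Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def emulate_read_proportions(n_pools, samples_per_pool, seed=0):
    """Hypothetical helper: emulate uneven mixing of pools and of samples within pools."""
    rng = random.Random(seed)
    # Share of the lane's reads contributed by each pool:
    # Dirichlet with all parameters equal to 4 (as stated in the text).
    pool_shares = sample_dirichlet([4.0] * n_pools, rng)
    # Share of each pool's reads contributed by each of its samples:
    # Dirichlet with all parameters equal to 3 (as stated in the text).
    sample_shares = [
        sample_dirichlet([3.0] * samples_per_pool, rng) for _ in range(n_pools)
    ]
    return pool_shares, sample_shares


# Illustrative sizes only; the source does not state them here.
pool_shares, sample_shares = emulate_read_proportions(n_pools=4, samples_per_pool=5)
```

Each returned vector sums to 1, so the values can be multiplied by a total read count to obtain per-pool and per-sample read numbers. Larger Dirichlet parameters would give proportions closer to an even split; the values 4 and 3 leave noticeable imbalance, which is presumably the point of the emulation.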


Summary

Introduction

Sample pooling is a method widely used in studies to reduce costs and labour. DNA sample pooling combined with massively parallel sequencing is a powerful tool for discovering DNA variants (polymorphisms) in large populations, and it underpins research fields such as genome-wide association studies and evolutionary and population studies. Multiple pools can be used to correct sequencing errors [2]. Another strategy for enhancing the accuracy of polymorphism detection is to use overlapping pools, in which each sample is added to multiple pools. Overlapping pools also make it possible to identify carriers of polymorphisms, including rare variants; this is extremely valuable for some applications, such as clinical trials.
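To make the idea of carrier identification in overlapping pools concrete, the toy sketch below follows the two steps named in the highlights: estimate the number of minor allele copies per pool from read counts, then list the sample subsets consistent with the pooling scheme. It is an illustration under simplifying assumptions, not s-dePooler's implementation: samples are assumed diploid and pooled in equal molar ratios, every carrier is assumed heterozygous, the estimate is a simple rounding of the alternative-read fraction, and the function names, the 2×3 grid scheme, and the read counts are all made up for the example.

```python
from itertools import combinations


def estimate_allele_copies(ref_reads, alt_reads, n_samples, ploidy=2):
    """Round the alternative-read fraction to a whole number of allele copies."""
    total = ref_reads + alt_reads
    if total == 0:
        return None  # no coverage, nothing to estimate
    return round(alt_reads / total * ploidy * n_samples)


def candidate_carrier_sets(scheme, copies_per_pool, max_carriers=3):
    """Enumerate sample subsets whose per-pool carrier counts match the estimates.

    Assumes each carrier contributes exactly one allele copy (heterozygous).
    """
    samples = sorted(set().union(*scheme.values()))
    matches = []
    for k in range(max_carriers + 1):
        for subset in combinations(samples, k):
            chosen = set(subset)
            if all(
                len(chosen & members) == copies_per_pool[pool]
                for pool, members in scheme.items()
            ):
                matches.append(chosen)
    return matches


# Hypothetical pooling scheme: six samples in two row pools and three column pools.
scheme = {
    "R1": {"A", "B", "C"},
    "R2": {"D", "E", "F"},
    "C1": {"A", "D"},
    "C2": {"B", "E"},
    "C3": {"C", "F"},
}
# Made-up (reference, alternative) read counts for one variant in each pool.
reads = {"R1": (120, 0), "R2": (100, 21), "C1": (90, 0), "C2": (75, 26), "C3": (80, 1)}

copies = {
    pool: estimate_allele_copies(ref, alt, len(scheme[pool]))
    for pool, (ref, alt) in reads.items()
}
candidates = candidate_carrier_sets(scheme, copies)
```

In this toy grid the estimates are one copy in pools R2 and C2 and none elsewhere, and the only subset consistent with them is sample E. The single stray alternative read in pool C3 rounds to zero copies, which hints at how overlapping pools can absorb some sequencing noise. Brute-force enumeration is only practical for very small schemes.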

