Abstract

Motivation: The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to appropriately model this additional variance to avoid inflating the false-positive rate.Results: We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams and can analyse both rare and common variants with lower or more variable pool depths compared to the other methods.Availability: Package ‘extraBinomial’ is on http://cran.r-project.org/Contact: chris.wallace@cimr.cam.ac.ukSupplementary information: Supplementary data are available at Bioinformatics Online.

Highlights

  • To date, numerous common genetic variants associated with common disease [e.g. Type 1 diabetes (T1D)] have been successfully discovered by genome-wide association studies (Barrett et al, 2009; Cooper et al, 2008; Smyth et al, 2006; Wellcome Trust Case Control Consortium, 2007)

  • Single-nucleotide polymorphisms (SNPs) located in the interferon induced with helicase C domain 1 (IFIH1) and the C-type lectin domain family 16, member A (CLEC16A) genes, displayed different frequencies in cases and controls according to Fisher’s exact test, and evidence for association of two of these variants within the gene IFIH1 was replicated in independent samples

  • We extend our study beyond the original minor allele frequency (MAF) range used by Nejentsev et al (1–3%) to include all observed single-nucleotide variation

Read more

Summary

Introduction

Numerous common genetic variants associated with common disease [e.g. Type 1 diabetes (T1D)] have been successfully discovered by genome-wide association studies (Barrett et al, 2009; Cooper et al, 2008; Smyth et al, 2006; Wellcome Trust Case Control Consortium, 2007). Single-nucleotide polymorphisms (SNPs) located in the interferon induced with helicase C domain 1 (IFIH1) and the C-type lectin domain family 16, member A (CLEC16A) genes, displayed different frequencies in cases and controls according to Fisher’s exact test, and evidence for association of two of these variants within the gene IFIH1 was replicated in independent samples. This directly implicates IFIH1 as causal in T1D

Objectives
Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.