Abstract

Rapid advances in sequencing technologies set the stage for the large-scale medical sequencing efforts to be performed in the near future, with the goal of assessing the importance of rare variants in complex diseases. The discovery of new disease susceptibility genes requires powerful statistical methods for rare variant analysis. The low frequency and the expected large number of such variants pose great difficulties for the analysis of these data. We propose here a robust and powerful testing strategy to study the role rare variants may play in affecting susceptibility to complex traits. The strategy is based on assessing whether rare variants in a genetic region collectively occur at significantly higher frequencies in cases compared with controls (or vice versa). A main feature of the proposed methodology is that, although it is an overall test assessing a possibly large number of rare variants simultaneously, the disease variants can be both protective and risk variants, with moderate decreases in statistical power when both types of variants are present. Using simulations, we show that this approach can be powerful under complex and general disease models, as well as in larger genetic regions where the proportion of disease susceptibility variants may be small. Comparisons with previously published tests on simulated data show that the proposed approach can have better power than the existing methods. An application to a recently published study on Type-1 Diabetes finds rare variants in gene IFIH1 to be protective against Type-1 Diabetes.

Highlights

  • Common diseases such as diabetes, heart disease, schizophrenia, etc., are likely caused by a complex interplay among many genes and environmental factors

  • We present here a novel testing strategy, based on a weighted-sum statistic, that is less sensitive than existing methods to the presence of both risk and protective variants in the genetic region under investigation

  • We introduce here a new testing strategy, which we call replication-based strategy, and which is based on a weighted-sum statistic, but that has the advantage of being less sensitive to the presence of both risk and protective variants in a genetic region of interest

Read more

Summary

Introduction

Common diseases such as diabetes, heart disease, schizophrenia, etc., are likely caused by a complex interplay among many genes and environmental factors. Common and rare variants could both be important contributors to disease risk. In a first attempt to find disease susceptibility loci, most research has focused on the discovery of common susceptibility variants. This effort has been helped by the widespread availability of genome-wide arrays providing almost complete genomic coverage for common variants. The genomewide association studies performed so far have led to the discovery of many common variants reproducibly associated with various complex traits, showing that common variants can affect risk to common diseases [2,3]. The estimated effect sizes for these variants are small (most odds ratios are below 1:5), with only a small fraction of trait heritability explained by these variants [4]. One possible explanation for this missing heritability is that, in addition to common variants, rare variants are important

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call