Abstract
The role of rare genetic variation in the etiology of complex disease remains unclear. However, the development of next-generation sequencing technologies offers the experimental opportunity to address this question. Several novel statistical methodologies have been recently proposed to assess the contribution of rare variation to complex disease etiology. Nevertheless, no empirical estimates comparing their relative power are available. We therefore assessed the parameters that influence their statistical power in 1,998 individuals Sanger-sequenced at seven genes by modeling different distributions of effect, proportions of causal variants, and direction of the associations (deleterious, protective, or both) in simulated continuous trait and case/control phenotypes. Our results demonstrate that the power of recently proposed statistical methods depends strongly on the underlying hypotheses concerning the relationship of phenotypes with each of these three factors. No method demonstrates consistently acceptable power despite this large sample size, and the performance of each method depends upon the underlying assumption of the relationship between rare variants and complex traits. Sensitivity analyses are therefore recommended to compare the stability of the results arising from different methods, and promising results should be replicated using the same method in an independent sample. These findings provide guidance in the analysis and interpretation of the role of rare base-pair variation in the etiology of complex traits and diseases.
Highlights
There is evidence that rare variants can contribute to the etiology of complex disease
Next-generation sequencing technologies have enabled their detection in large cohorts, and new statistical methods have been proposed to ascertain their association with complex diseases and traits in order to improve power over single-marker analysis
We sought to compare the power of commonly used and novel statistical methods for rare variants using Sanger sequencing data from 1,998 individuals sequenced at 7 genes by simulating several phenotypes under models spanning a spectrum of the common hypotheses concerning such associations
Summary
There is growing evidence that rare variants contribute to the etiology of complex diseases [1,2,3,4]. It has been demonstrated that rare coding variants associated with complex traits are sometimes causal through amino acid substitution [3,8,9]. For these reasons, rare variants hold promise as a source of heritability that is not explained by common base-pair variants. Over the past two years, a growing body of methods [2,10,11,12,13,14,15,16,17,18,19,20] has emerged that seeks to improve power over single-marker analysis. These methods generally employ three main strategies: collapsing markers across a region, weighting and/or prioritizing markers, and distribution-based approaches.
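As a rough illustration of the first two strategies, the sketch below collapses rare variants in a region into a per-individual carrier indicator and, alternatively, into a weighted burden score. It is a minimal sketch under stated assumptions, not any of the cited methods: the function names, the smoothed inverse-frequency weights, and the permutation test are all illustrative choices, and genotypes are assumed to be coded as minor-allele counts (0, 1, or 2) in a list of rows, one row per individual. Distribution-based approaches are not sketched here.

```python
import math
import random


def collapse_carrier(genotypes):
    """Collapsing: 1 if an individual carries any minor allele in the region, else 0."""
    return [1 if any(g > 0 for g in row) else 0 for row in genotypes]


def inverse_frequency_weights(genotypes):
    """Hypothetical weighting scheme: rarer variants receive larger weights.

    Uses a smoothed minor-allele frequency so that no weight is undefined.
    """
    n = len(genotypes)
    n_sites = len(genotypes[0])
    weights = []
    for j in range(n_sites):
        allele_count = sum(row[j] for row in genotypes)
        q = (allele_count + 1) / (2 * n + 2)  # smoothed allele frequency
        weights.append(1.0 / math.sqrt(q * (1.0 - q)))
    return weights


def weighted_burden(genotypes, weights):
    """Weighting: per-individual sum of weighted minor-allele counts."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def permutation_p_value(scores, is_case, n_perm=2000, seed=0):
    """Two-sided permutation test on the case/control difference in mean score.

    Assumes at least one case and one control are present.
    """
    rng = random.Random(seed)

    def mean_diff(labels):
        cases = [s for s, c in zip(scores, labels) if c]
        controls = [s for s, c in zip(scores, labels) if not c]
        return sum(cases) / len(cases) - sum(controls) / len(controls)

    observed = abs(mean_diff(is_case))
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(mean_diff(labels)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Either score can be passed to `permutation_p_value` together with a list of case/control labels. Because both scores sum over variants, they implicitly assume that causal variants act in the same direction, which is one of the assumptions the abstract identifies as driving differences in power.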