Abstract

While it is well established that genetics can be a major contributor to population variation of complex traits, the relative contributions of rare and common variants to phenotypic variation remain a matter of considerable debate. Here, we simulate genetic and phenotypic data across different case/control panel sampling strategies, sequencing methods, and genetic architecture models based on evolutionary forces to determine the statistical performance of rare variant association tests (RVATs) in wide use. We find that the highest statistical power of RVATs is achieved by sampling case/control individuals from the extremes of an underlying quantitative trait distribution. We also demonstrate that the use of genotyping arrays, in conjunction with imputation from a whole-genome sequenced (WGS) reference panel, recovers the vast majority (90%) of the power that could be achieved by sequencing the case/control panel using current tools. Finally, we show that for dichotomous traits, the statistical performance of RVATs decreases as rare variants become more important in the trait architecture. Our results extend previous work to show that RVATs are insufficiently powered to make generalizable conclusions about the role of rare variants in dichotomous complex traits.

Highlights

  • Genome-wide association studies (GWAS) have detected many common variants associated with hundreds of complex heritable phenotypes, but for many traits, much of that heritability remains unexplained

  • We focus on the genetic variance explained by variants with minor allele frequencies (MAF) < 1% (V0.01), which ranges from 99% when ρ=1, t=1 to less than 1% when ρ=0, t=0.5

  • As these tests and data become increasingly prevalent, we examine how to optimize the design of a rare variant association study to maximize power

Introduction

Genome-wide association studies (GWAS) have detected many common variants associated with hundreds of complex heritable phenotypes, but for many traits, much of that heritability remains unexplained. Power to detect rare variant associations is low in single-marker statistical tests at the genome-wide scale. Researchers have therefore proposed many rare variant association tests (RVATs), statistical methods that pool rare variants within a putatively causal locus and test the pooled set for association with the phenotype. These RVATs are broadly classified into three categories: burden tests (Liu & Leal, 2010), variance-component tests (Neale et al, 2011; Wu et al, 2011), and combined tests (Lee et al, 2012; Sun, Zheng, & Hsu, 2013). Though each test is published with its own validation simulations, these simulations are generally not comparable with one another, and each has its own flaws. Moutsianas et al (2015) systematically characterized the performance of commonly used gene-based rare variant association tests under a range of genetic architectures, sample sizes, variant effect sizes, and significance thresholds. They found that MiST, SKAT-O, and KBAC have the highest mean power across simulated data, but that these tests had overall low power even for loci with relatively large effect sizes.
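
To make the idea of pooling rare variants concrete, the following is a minimal Python sketch, not the simulation framework or any of the published tests cited above. It assumes a hypothetical additive model in which every rare allele at a locus shifts a quantitative trait by a fixed amount, draws cases and controls from the two tails of that trait, and then applies a simple carrier-collapsing burden test (a two-proportion z-test on whether an individual carries any rare allele). The function names, the MAF of 0.5%, the effect size, and the 10% tail fraction are all illustrative assumptions.

```python
import math
import random


def simulate_cohort(n, n_sites, maf, effect, rng):
    """Return (genotypes, trait) for n people at n_sites rare-variant sites."""
    genotypes, trait = [], []
    for _ in range(n):
        # Each site carries 0, 1, or 2 copies of the rare allele.
        g = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n_sites)]
        genotypes.append(g)
        # Assumed additive model: each rare allele shifts the trait by `effect`,
        # plus standard normal noise.
        trait.append(effect * sum(g) + rng.gauss(0.0, 1.0))
    return genotypes, trait


def sample_extremes(trait, fraction):
    """Indices of controls (lowest tail) and cases (highest tail)."""
    order = sorted(range(len(trait)), key=trait.__getitem__)
    k = int(len(trait) * fraction)
    return order[:k], order[-k:]


def burden_test(genotypes, controls, cases):
    """Collapse the locus to carrier status and compare carrier rates.

    Returns (z, two-sided p) from a pooled two-proportion z-test.
    """
    def carriers(idx):
        return sum(1 for i in idx if any(genotypes[i]))

    a, b = carriers(cases), carriers(controls)
    n1, n0 = len(cases), len(controls)
    pooled = (a + b) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    if se == 0:
        return 0.0, 1.0  # no carriers, or everyone is a carrier
    z = (a / n1 - b / n0) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


rng = random.Random(0)
genotypes, trait = simulate_cohort(n=5000, n_sites=20, maf=0.005, effect=1.0, rng=rng)
controls, cases = sample_extremes(trait, fraction=0.1)
z, p = burden_test(genotypes, controls, cases)
print(f"z = {z:.2f}, p = {p:.3g}")
```

Collapsing to carrier status is only one way to pool variants. Variance-component and combined tests weight and aggregate variants differently, so this sketch should not be read as representative of their power.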
