Abstract

Advances in next-generation sequencing technology have enabled systematic exploration of the contribution of rare variation to Mendelian and complex diseases. Although it is well known that population stratification can generate spurious associations with common alleles, its impact on rare variant association methods remains poorly understood. Here, we performed exhaustive coalescent simulations with demographic parameters calibrated from exome sequence data to evaluate the performance of nine rare variant association methods in the presence of fine-scale population structure. We find that all methods have an inflated spurious association rate for parameter values that are consistent with levels of differentiation typical of European populations. For example, at a nominal significance level of 5%, some test statistics have a spurious association rate as high as 40%. Finally, we empirically assess the impact of population stratification in a large data set of 4,298 European American exomes. Our results have important implications for the design, analysis, and interpretation of rare variant genome-wide association studies.

Highlights

  • Population structure can be a strong confounding factor in association studies [1,2,3,4], and accounting for it can be important, even in cases where seemingly homogeneous ethnic populations are sampled

  • We will refer to the elevation in or inflation of significance rates as the spurious association rate (SAR) throughout the rest of the paper to emphasize the point that population stratification causes genuine associations between genotypes at a locus and a phenotype, but such associations are due to genetic substructure rather than alleles causally related to the trait

  • Rare variant association methods. We evaluated nine rare variant association methods: the collapsed χ² test, the collapsed Fisher’s Exact Test (FET), the Weighted Sum Statistic (WSS) [23], Variable Threshold (VT) [24], RareCover [25], and four methods implemented under a logistic regression framework
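To make the idea of a collapsed test concrete, the following is a minimal sketch in Python of a collapsed χ² test, under assumptions that are not taken from the paper: each individual is reduced to a carrier indicator (1 if they carry at least one rare allele in the region), and carrier status is compared between cases and controls with a Pearson χ² statistic on 1 degree of freedom, without continuity correction. The function names and the toy genotype matrices are hypothetical.

```python
import math


def collapse_carriers(genotypes):
    """Collapse each row of rare-allele counts to a 0/1 carrier indicator."""
    return [1 if any(g > 0 for g in row) else 0 for row in genotypes]


def collapsed_chi2_test(genotypes, is_case):
    """Return (statistic, p_value) for carrier status versus case status.

    genotypes: list of per-individual lists of rare-allele counts (0, 1, 2).
    is_case:   list of booleans, True for cases and False for controls.
    """
    carriers = collapse_carriers(genotypes)
    a = sum(1 for c, y in zip(carriers, is_case) if c and y)          # case carriers
    b = sum(1 for c, y in zip(carriers, is_case) if not c and y)      # case non-carriers
    c2 = sum(1 for c, y in zip(carriers, is_case) if c and not y)     # control carriers
    d = sum(1 for c, y in zip(carriers, is_case) if not c and not y)  # control non-carriers
    n = a + b + c2 + d
    margins = (a + b, c2 + d, a + c2, b + d)
    if 0 in margins:
        return 0.0, 1.0  # degenerate table: treated as no evidence of association
    stat = n * (a * d - b * c2) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Toy data (hypothetical): 4 cases followed by 4 controls, 3 variant sites each.
balanced = [[1, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]] * 2
skewed = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [2, 0, 0]] + [[0, 0, 0]] * 4
labels = [True] * 4 + [False] * 4
```

Collapsing discards which site carries the rare allele and how many such alleles are present; the weighted and thresholded methods in the list above differ mainly in how they retain or weight that information.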



Introduction

Population structure can be a strong confounding factor in association studies [1,2,3,4], and accounting for it can be important, even in cases where seemingly homogeneous ethnic populations are sampled. As common variants have been unable to account for a significant proportion of complex disease heritability [11,12], there is increasing interest in systematically evaluating the contribution of rare variants to disease. To this end, a large number of rare variant association test statistics have been developed (reviewed in Bansal et al. [13] and Asimit and Zeggini [14]) and used to identify a growing catalog of rare alleles that may influence disease risk [13,14]. We will refer to the elevation in or inflation of significance rates as the spurious association rate (SAR) throughout the rest of the paper to emphasize the point that population stratification causes genuine associations between genotypes at a locus and a phenotype, but such associations are due to genetic substructure rather than alleles causally related to the trait.
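The SAR can be illustrated with a toy simulation. The sketch below is not the coalescent framework used in the paper; it assumes two subpopulations with different rare-allele carrier frequencies, no causal effect of carrier status on the trait, and cases and controls drawn from the two subpopulations in different proportions. All parameter values and function names are hypothetical and chosen only to make the effect visible. The SAR is estimated as the fraction of replicates in which a collapsed χ² test (1 degree of freedom) rejects at the nominal level.

```python
import math
import random


def chi2_pvalue_2x2(a, b, c, d):
    """Pearson chi-square (1 df) p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: treated as no evidence of association
    n = row1 + row2
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return math.erfc(math.sqrt(stat / 2.0))


def count_carriers(n, frac_pop_a, freq_a, freq_b, rng):
    """Sample n individuals from a two-population mixture; count carriers."""
    carriers = 0
    for _ in range(n):
        freq = freq_a if rng.random() < frac_pop_a else freq_b
        if rng.random() < freq:
            carriers += 1
    return carriers


def spurious_association_rate(case_frac_a, control_frac_a, n_per_group=500,
                              freq_a=0.10, freq_b=0.02, alpha=0.05,
                              n_reps=400, seed=1):
    """Fraction of null replicates with p < alpha.

    Carrier status never affects the trait here, so every rejection is
    spurious in the causal sense; only ancestry proportions differ.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        case_carriers = count_carriers(n_per_group, case_frac_a, freq_a, freq_b, rng)
        ctrl_carriers = count_carriers(n_per_group, control_frac_a, freq_a, freq_b, rng)
        p = chi2_pvalue_2x2(case_carriers, n_per_group - case_carriers,
                            ctrl_carriers, n_per_group - ctrl_carriers)
        if p < alpha:
            rejections += 1
    return rejections / n_reps


# Matched ancestry in cases and controls versus mismatched ancestry.
sar_matched = spurious_association_rate(case_frac_a=0.5, control_frac_a=0.5)
sar_stratified = spurious_association_rate(case_frac_a=0.7, control_frac_a=0.3)
print(f"SAR, matched ancestry:    {sar_matched:.3f}")
print(f"SAR, mismatched ancestry: {sar_stratified:.3f}")
```

With matched ancestry proportions the rejection rate should stay near the nominal level, whereas mismatched proportions should push it well above that level even though no variant is causal. The size of the inflation in this sketch depends entirely on the arbitrary parameters above and should not be read as an estimate for any real population.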
