Abstract

In genetic association studies, rare variants with extremely low allele frequencies play a crucial role in complex traits. Therefore, set-based testing methods that jointly assess the effects of groups of single nucleotide polymorphisms (SNPs) were developed to increase the powers of the association tests. However, these powers are still insufficient, and precise estimations of the effect sizes of individual SNPs are largely impossible. In this article, we provide an efficient set-based statistical inference framework that addresses both of these important issues simultaneously using an empirical Bayes method with semiparametric multilevel mixture modeling. We propose to utilize the hierarchical model that incorporates variations in set-specific effects and to apply the optimal discovery procedure (ODP) that achieves the largest overall power in multiple significance testing. In addition, we provide an optimal "set-based" estimator of the empirical distribution of effect sizes. The efficiency of the proposed methods is demonstrated through application to a genome-wide association study of coronary artery disease and through simulation studies. The results demonstrated numerous rare variants with large effect sizes for coronary artery disease, and the number of significant sets detected by the ODP was much greater than those identified by existingmethods.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.