Abstract
Gene-based association tests aggregate genotypes across multiple variants for each gene, providing an interpretable gene-level analysis framework for genome-wide association studies (GWAS). Early applications of gene-based tests often focused on rare coding variants; a more recent wave of gene-based methods, such as transcriptome-wide association studies (TWAS), uses expression quantitative trait loci (eQTLs) to interrogate regulatory associations. Regulatory variants are expected to be particularly valuable for gene-based analysis, since most GWAS associations to date are non-coding. However, identifying causal genes from regulatory associations remains challenging and contentious. Here, we present a statistical framework and computational tool to integrate heterogeneous annotations with GWAS summary statistics for gene-based analysis, applied with comprehensive coding and tissue-specific regulatory annotations. We compare power and accuracy in identifying causal genes across single-annotation, omnibus, and annotation-agnostic gene-based tests in simulation studies and in an analysis of 128 traits from the UK Biobank. We find that incorporating heterogeneous annotations in gene-based association analysis increases power and improves performance in identifying causal genes.
Highlights
Genome-wide association studies (GWAS) have identified thousands of genetic loci associated with complex traits [1]; the biological mechanisms underlying these associations are often poorly understood.
Gene-based association tests are statistical methods used in GWAS to identify genes that affect heritable traits.
Gene-based tests are formed by aggregating genotypes across multiple genetic variants for each gene, often including only variants that are likely to affect gene function or regulation.
Summary
Genome-wide association studies (GWAS) have identified thousands of genetic loci associated with complex traits [1]; the biological mechanisms underlying these associations are often poorly understood. Gene-based association tests can provide a more interpretable analysis framework than single-variant analysis, interrogating association at the gene level by aggregating genotypes across multiple variants for each gene. This strategy can increase power to detect association by aggregating small effects across variants, reducing the burden of multiple testing, and weighting or filtering to prioritize functional variants [2, 3]. Incorporating regulatory variants is expected to be valuable for gene-based analysis of complex traits, since most genetic associations discovered to date are in non-coding regions [9]. While coding variants generally implicate a single known gene, the gene(s) affected by regulatory variants are often less clear [10, 11].
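To make the idea of aggregating weighted variants from summary statistics concrete, the following is a minimal sketch of a generic weighted-sum (burden-style) gene-level statistic. It is not the framework or tool presented in this paper; it only illustrates the general technique. The function names `burden_z` and `two_sided_p`, the example z-scores, the linkage disequilibrium (LD) correlation matrix, and the annotation weights are all hypothetical. The sketch assumes that, under the null hypothesis, the single-variant z-scores are jointly normal with mean zero and covariance equal to the LD matrix.

```python
import math

import numpy as np


def burden_z(z, ld, weights):
    """Weighted-sum gene-level z-statistic from single-variant z-scores.

    Assumes z ~ N(0, ld) under the null, so that w'z has variance w' ld w.
    """
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    w = np.asarray(weights, dtype=float)
    variance = float(w @ ld @ w)
    if variance <= 0.0:
        raise ValueError("weights and LD matrix give a non-positive variance")
    return float(w @ z) / math.sqrt(variance)


def two_sided_p(z_stat):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z_stat) / math.sqrt(2.0))


# Hypothetical gene with three variants (illustrative values only).
z_scores = [2.1, 1.4, 2.6]
ld_matrix = np.array([
    [1.0, 0.3, 0.1],
    [0.3, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])
# Hypothetical annotation-derived weights; a weight of 0 filters a variant out.
annotation_weights = [1.0, 0.5, 1.0]

gene_z = burden_z(z_scores, ld_matrix, annotation_weights)
gene_p = two_sided_p(gene_z)
print(f"gene-level z = {gene_z:.3f}, p = {gene_p:.3g}")
```

Setting a weight to zero corresponds to filtering a variant out, while unequal non-zero weights correspond to prioritizing variants by annotation. Accounting for LD in the denominator matters: with perfectly correlated variants the combined statistic carries no more evidence than a single variant, whereas independent variants with consistent effects reinforce each other.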