Abstract

Gene-based association tests aggregate genotypes across multiple variants for each gene, providing an interpretable gene-level analysis framework for genome-wide association studies (GWAS). Early applications of gene-based tests often focused on rare coding variants; more recent gene-based methods, such as transcriptome-wide association studies (TWAS), use expression quantitative trait loci (eQTLs) to interrogate regulatory associations. Regulatory variants are expected to be particularly valuable for gene-based analysis, since most GWAS associations to date are non-coding. However, identifying causal genes from regulatory associations remains challenging and contentious. Here, we present a statistical framework and computational tool to integrate heterogeneous annotations with GWAS summary statistics for gene-based analysis, applied with comprehensive coding and tissue-specific regulatory annotations. We compare power and accuracy in identifying causal genes across single-annotation, omnibus, and annotation-agnostic gene-based tests in simulation studies and in an analysis of 128 traits from the UK Biobank. We find that incorporating heterogeneous annotations in gene-based association analysis increases power and improves performance in identifying causal genes.

Highlights

  • Gene-based association tests are statistical methods used in genome-wide association studies (GWAS) to identify genes that affect heritable traits

  • Gene-based tests are formed by aggregating genotypes across multiple genetic variants for each gene, often including only variants that are likely to affect gene function or regulation

Introduction

Genome-wide association studies (GWAS) have identified thousands of genetic loci associated with complex traits [1]; the biological mechanisms underlying these associations are often poorly understood. Gene-based association tests can provide a more interpretable analysis framework than single-variant analysis, interrogating association at the gene level by aggregating genotypes across multiple variants for each gene. This strategy can increase power to detect association in three ways: by aggregating small effects across variants, by reducing the burden of multiple testing, and by weighting or filtering to prioritize functional variants [2, 3]. Incorporating regulatory variants is expected to be valuable for gene-based analysis of complex traits, since most genetic associations discovered to date are in non-coding regions [9]. While coding variants generally implicate a single known gene, the gene or genes affected by regulatory variants are often less clear [10, 11].
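
To make the aggregation idea concrete, the following is a minimal sketch of one common form of gene-based test, a weighted burden-style statistic computed from GWAS summary statistics. It is an illustration under stated assumptions, not necessarily the framework presented in this paper. It assumes that the single-variant z-scores for a gene are jointly normal under the null hypothesis with covariance equal to the linkage disequilibrium (LD) correlation matrix R, so that the weighted sum w'z has variance w'Rw. The function name `weighted_burden_test`, its arguments, and the example weights are hypothetical.

```python
import math


def weighted_burden_test(z_scores, ld_matrix, weights):
    """Weighted burden-style gene-level test from single-variant z-scores.

    Assumption (not from the source text): under the null, the z-scores are
    multivariate normal with mean 0 and covariance equal to ld_matrix.

    z_scores  : per-variant association z-scores for one gene
    ld_matrix : square LD correlation matrix between those variants
    weights   : per-variant weights, e.g. derived from functional annotation
                (a weight of 0 filters the variant out entirely)
    Returns (gene-level z-score, two-sided p-value).
    """
    m = len(z_scores)
    if len(weights) != m or len(ld_matrix) != m or any(len(r) != m for r in ld_matrix):
        raise ValueError("z_scores, weights, and ld_matrix must have matching dimensions")

    # Numerator: weighted sum of single-variant z-scores, w'z.
    numerator = sum(w * z for w, z in zip(weights, z_scores))

    # Null variance of the weighted sum, w'Rw, which accounts for LD.
    variance = sum(
        weights[i] * ld_matrix[i][j] * weights[j]
        for i in range(m)
        for j in range(m)
    )
    if variance <= 0:
        raise ValueError("w'Rw must be positive; check the weights and LD matrix")

    z_gene = numerator / math.sqrt(variance)
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_gene = math.erfc(abs(z_gene) / math.sqrt(2.0))
    return z_gene, p_gene


# Hypothetical example: four independent variants, each with a modest z-score.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
z_gene, p_gene = weighted_burden_test([2.0, 2.0, 2.0, 2.0], identity, [1.0] * 4)
print(z_gene, p_gene)
```

In this toy example, four independent variants with the same modest effect direction combine into a gene-level statistic that is stronger than any single variant, which is the sense in which aggregation can increase power. A burden-style sum is only one choice: it loses power when effects point in opposite directions, and variance-component or omnibus tests are typically used to cover those cases.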

