Abstract

Genome-wide association studies (GWAS) have implicated 58 loci in coronary artery disease (CAD). However, the biological basis for these associations, the relevant genes, and the causative variants often remain uncertain. Since the vast majority of GWAS loci reside outside coding regions, most likely exert regulatory functions. Here we explore the complexity of each of these loci, using tissue-specific RNA sequencing data from GTEx to identify genes that exhibit altered expression patterns in the context of GWAS-significant loci. This expands the list of candidate genes from the 75 currently annotated by GWAS to 245, with almost half of these transcripts being non-coding. Tissue-specific allelic expression imbalance data, also from GTEx, allow us to uncover GWAS variants that mark functional variation in a locus, e.g., rs7528419 in the SORT1 locus, specifically in liver, and rs72689147 in the GUCY1A1 locus, across a variety of tissues. We consider the GWAS variant rs1412444 in the LIPA locus in more detail as an example, probing tissue- and transcript-specific effects of genetic variation in the region. By evaluating linkage disequilibrium (LD) between tissue-specific eQTLs, we reveal evidence for multiple functional variants within loci. We identify three variants (rs1412444, rs1051338, rs2250781) that, when considered together, each improve the ability to account for LIPA gene expression, suggesting multiple interacting factors. These results refine the assignment of 58 GWAS loci to likely causative variants in a handful of cases and, for the remainder, help to re-prioritize associated genes and RNA isoforms, suggesting that ncRNAs may be relevant transcripts in almost half of CAD GWAS results. Our findings support a multi-factorial system in which a single variant can influence multiple genes and each gene is regulated by multiple variants.
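
To make the LD comparison between eQTLs concrete, the following is a minimal sketch of the standard r² statistic for two biallelic variants, computed from phased haplotypes. It is not the authors' pipeline: the function name `ld_r2` and the toy haplotypes are assumptions made up for illustration. A low r² between two eQTLs for the same gene is consistent with separate functional signals, whereas an r² near 1 means the two variants cannot be distinguished statistically.

```python
from typing import Sequence, Tuple


def ld_r2(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Return r^2 between two biallelic variants from phased haplotypes.

    Each haplotype is (allele_at_variant_1, allele_at_variant_2), coded 0/1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Allele frequencies of the '1' allele at each variant.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    # Frequency of haplotypes carrying the '1' allele at both variants.
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both variants must be polymorphic")

    # D is the deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return d * d / denom


# Toy, made-up haplotypes (hypothetical data, not from GTEx).
perfect_ld = [(1, 1)] * 4 + [(0, 0)] * 6          # alleles always travel together
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)] * 3      # alleles are independent

r2_perfect = ld_r2(perfect_ld)
r2_none = ld_r2(no_ld)
```

In practice LD is usually estimated with dedicated tools on a reference panel; this sketch only shows what the statistic measures.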

Highlights

  • Genome-wide association studies (GWAS) have identified dozens of genetic variants (SNPs) associated with cardiovascular disease risk and related clinical phenotypes

  • To assign GWAS variants to target genes, we determine for each GWAS SNP whether it appears as an eQTL or sQTL reported by the Genotype-Tissue Expression (GTEx) project, searching all available tissues

  • We consider the corresponding gene for any transcript that physically overlaps the GWAS variant regardless of strand, incorporating coding, non-coding, and antisense genes. Using these three approaches, we expand the list of potential candidate genes for the 58 GWAS loci from 75 to 245 (Fig 1A, S1 Fig, S1 File; a comprehensive table is included in S3 File)
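
The three approaches listed above (eQTL lookup, sQTL lookup, and physical overlap regardless of strand) can be sketched as a simple set union per SNP. This is an illustrative sketch only: the record types `QtlHit` and `Transcript`, the function `candidate_genes`, and all SNP and gene identifiers are hypothetical placeholders, not GTEx file formats or real results.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class QtlHit(NamedTuple):
    snp: str
    gene: str
    tissue: str
    kind: str  # "eQTL" or "sQTL"


class Transcript(NamedTuple):
    gene: str
    chrom: str
    start: int  # inclusive
    end: int    # inclusive
    strand: str


def candidate_genes(
    snp_positions: Dict[str, Tuple[str, int]],
    qtl_hits: Iterable[QtlHit],
    transcripts: Iterable[Transcript],
) -> Dict[str, Set[str]]:
    """Collect candidate genes per GWAS SNP from QTL hits and physical overlap."""
    genes: Dict[str, Set[str]] = {snp: set() for snp in snp_positions}

    # Approaches 1 and 2: any eQTL or sQTL hit for the SNP, in any tissue.
    for hit in qtl_hits:
        if hit.snp in genes:
            genes[hit.snp].add(hit.gene)

    # Approach 3: any transcript overlapping the SNP position; strand is ignored.
    transcripts = list(transcripts)
    for snp, (chrom, pos) in snp_positions.items():
        for tx in transcripts:
            if tx.chrom == chrom and tx.start <= pos <= tx.end:
                genes[snp].add(tx.gene)

    return genes


# Hypothetical toy inputs.
snps = {"rsA": ("chr1", 150), "rsB": ("chr2", 500)}
hits = [
    QtlHit("rsA", "GENE1", "liver", "eQTL"),
    QtlHit("rsA", "GENE2", "artery", "sQTL"),
    QtlHit("rsC", "GENE9", "liver", "eQTL"),  # not a GWAS SNP, so ignored
]
txs = [
    Transcript("GENE1", "chr1", 100, 200, "+"),
    Transcript("GENE3-AS", "chr1", 120, 180, "-"),  # antisense overlap still counts
    Transcript("GENE4", "chr2", 600, 700, "+"),     # does not overlap rsB
]

result = candidate_genes(snps, hits, txs)
```

Using sets means a gene supported by more than one line of evidence is counted once, which matches the idea of a deduplicated candidate list per locus.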

Introduction

To interpret and refine GWAS results for coronary artery disease (CAD), we use RNA expression, in addition to physical position, to prioritize the variants and gene(s) most likely to be relevant. SNPs located within RNA exons may alter the protein sequence and influence RNA structure and function in a transcript-specific manner [8]. Some of these GWAS loci consist of gene clusters that are coordinately regulated [9], and almost all include multiple RNA isoforms expressed from a given gene, including splice isoforms. While confirming 47 of the 48 previously identified loci, this study identified an additional 10 at genome-wide significance, bringing the total count of CAD-associated loci to 58. Each of these loci is based on robust statistical associations for one or more SNPs in the locus. We consider each of these 58 loci in detail, using QTL and position to re-prioritize candidate genes and focusing on a subset of loci, to begin resolving the inherent complexities of genomic architecture.

