Abstract

Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations are tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.

Highlights

  • Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed

  • Several reasons may explain this lack of enrichment: genes identified with GWAS and subsequently with S-PrediXcan have rather small effect sizes, so it would not be surprising if they were missed until very large sample sizes were aggregated; ClinVar genes may originate from rare mutations that are not well covered by our prediction models, which are based on common variation; or the mechanism of action of the schizophrenia-linked ClinVar genes may be different from the alteration of expression levels

  • Summary Mendelian Randomization (SMR) quantifies the strength of the association between expression levels of a gene and complex traits with the SMR statistic (T_SMR), a function of the eQTL and GWAS Z-score statistics
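
The SMR statistic mentioned in the last highlight is not written out on this page. As commonly stated in the SMR literature, it is T_SMR = (Z_eQTL^2 * Z_GWAS^2) / (Z_eQTL^2 + Z_GWAS^2), and it is treated as approximately chi-square distributed with 1 degree of freedom under the null. The sketch below assumes that form; the function names `t_smr` and `chi2_1df_pvalue` are hypothetical and are not taken from any SMR or S-PrediXcan software.

```python
import math


def t_smr(z_eqtl: float, z_gwas: float) -> float:
    """SMR statistic from the top-eQTL Z-score and its GWAS Z-score.

    Assumed form: T_SMR = (z_eqtl^2 * z_gwas^2) / (z_eqtl^2 + z_gwas^2).
    """
    numerator = (z_eqtl ** 2) * (z_gwas ** 2)
    denominator = z_eqtl ** 2 + z_gwas ** 2
    if denominator == 0.0:
        # Both Z-scores are zero, so the ratio is undefined.
        raise ValueError("at least one Z-score must be non-zero")
    return numerator / denominator


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if stat < 0.0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Illustrative Z-scores only, not values from the paper.
    stat = t_smr(3.0, 4.0)  # 9 * 16 / 25 = 5.76
    print(stat, chi2_1df_pvalue(stat))
```

Because the statistic is bounded above by the smaller of the two squared Z-scores, a weak eQTL signal limits the significance that SMR can reach, however strong the GWAS signal is.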

Summary

Introduction

Integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. We derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. The most comprehensive transcriptome dataset, in terms of examined tissues, is the Genotype-Tissue Expression Project (GTEx): a large-scale effort where DNA and RNA were collected from multiple tissue samples from nearly 1000 individuals and sequenced to high coverage [9,10]. This remarkable resource provides a comprehensive cross-tissue survey of the functional consequences of genetic variation at the transcript level. In order to harness the power of these increased sample sizes while keeping the computational burden manageable, methods that use summary-level data rather than individual-level data are needed.
