Abstract

Background: While genome-wide association studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding the mechanisms that lead from genetic variation to pathophysiology remains an important challenge. Methods are needed to systematically bridge this crucial gap to facilitate experimental testing of hypotheses and translation to clinical utility.

Results: Here, we leveraged cross-phenotype associations to identify traits with shared genetic architecture, using linkage disequilibrium (LD) information to accurately capture shared SNPs by proxy and to calculate the significance of enrichment. This shared genetic architecture was examined across differing biological scales by incorporating data from catalogs of clinical, cellular, and molecular GWAS. We have created an interactive web database (interactive Cross-Phenotype Analysis of GWAS database (iCPAGdb)) to facilitate exploration and allow rapid analysis of user-uploaded GWAS summary statistics. This database revealed well-known relationships among phenotypes and generated novel hypotheses to explain the pathophysiology of common diseases. Application of iCPAGdb to a recent GWAS of severe COVID-19 demonstrated unexpected overlap of GWAS signals between COVID-19 and human diseases, including an overlap with idiopathic pulmonary fibrosis driven by the DPP9 locus. Transcriptomics from peripheral blood of COVID-19 patients demonstrated that DPP9 was induced in patients with SARS-CoV-2 infection compared to healthy controls or those with bacterial infection. Further investigation of cross-phenotype SNPs associated with both severe COVID-19 and other human traits demonstrated colocalization of the GWAS signal at the ABO locus with plasma protein levels of a reported receptor of SARS-CoV-2, CD209 (DC-SIGN). This finding points to a possible mechanism whereby glycosylation of CD209 by ABO may regulate COVID-19 disease severity.

Conclusions: Thus, connecting genetically related traits across phenotypic scales links human diseases to molecular and cellular measurements that can reveal mechanisms and lead to novel biomarkers and therapeutic approaches. The iCPAGdb web portal is accessible at http://cpag.oit.duke.edu and the software code at https://github.com/tbalmat/iCPAGdb.

Highlights

  • While genome-wide association studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding the mechanisms that lead from genetic variation to pathophysiology remains an important challenge

  • The original CPAG used only 14,198 SNP-trait associations for 887 traits from the NHGRI-EBI GWAS catalog. Beyond this large expansion in traits and associations, we improved on the original CPAG algorithm by clumping GWAS data from each study (Additional file 2: Figure S1), creating a database of linkage disequilibrium (LD) values based on 1000 Genomes [21], allowing selection of European, African, or Asian LD structure, and efficiently capturing cross-phenotype associations that are driven by LD proxy (Fig. 1b)

  • iCPAGdb first selects the lead SNPs from all associated loci at a selected p value threshold (p < 5 × 10⁻⁸ was used for analysis of the NHGRI-EBI GWAS catalog; Additional file 3: Table S1; Additional file 4: Figure S2)
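
The idea of counting lead SNPs shared between two traits, directly or through an LD proxy, and then testing that overlap for enrichment can be illustrated with a minimal sketch. This is not the iCPAGdb implementation: the function names, the toy SNP identifiers, the LD proxy map, and the choice of a one-sided hypergeometric test are all assumptions made for illustration, and the total number of independent loci (`n_total`) is a placeholder value.

```python
from math import comb

def ld_shared_loci(lead_a, lead_b, ld_proxies):
    """Count lead SNPs of trait A that match a lead SNP of trait B,
    either directly or through an LD proxy (hypothetical helper)."""
    set_b = set(lead_b)
    shared = 0
    for snp in lead_a:
        # The SNP itself plus every SNP assumed to be in high LD with it.
        candidates = {snp} | set(ld_proxies.get(snp, ()))
        if candidates & set_b:
            shared += 1
    return shared

def hypergeom_upper_tail(k, n_a, n_b, n_total):
    """P(X >= k) when n_b loci are drawn from n_total, of which n_a belong
    to trait A. Assumed enrichment test, not taken from the source."""
    upper = min(n_a, n_b)
    numerator = sum(
        comb(n_a, i) * comb(n_total - n_a, n_b - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_b)

# Toy input: rs1 and rs9 are assumed to be LD proxies of each other.
lead_a = ["rs1", "rs2", "rs3"]
lead_b = ["rs2", "rs9", "rs10"]
ld_proxies = {"rs1": {"rs9"}, "rs9": {"rs1"}}

shared = ld_shared_loci(lead_a, lead_b, ld_proxies)
p_value = hypergeom_upper_tail(shared, len(lead_a), len(lead_b), n_total=100)
print(shared, p_value)
```

In this toy case the proxy lookup recovers one overlap (rs1 via rs9) that an exact-match comparison would miss, which is the motivation for using LD information when comparing lead SNP sets.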

Summary

Introduction

While genome-wide association studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding the mechanisms that lead from genetic variation to pathophysiology remains an important challenge. The second approach is colocalization, which estimates how well the GWAS signals from two traits overlap in a given region while revealing the plausibility of individual causal variants [7]. These two methods have successfully identified novel genetic connections across distant traits as well as pleiotropic genomic regions, but they have generally been used independently of each other. Valuable websites, including PhenoScanner [9], GRASP [10], and GeneATLAS [11], have integrated thousands of GWAS with billions of SNP-trait associations and allow users to query individual SNPs across the phenome. Such PheWAS approaches do not leverage shared genetic architecture that extends beyond individual SNPs and do not take advantage of LD information.
