Abstract

Genome wide association studies (GWAS) of human diseases have generally identified many loci associated with risk with relatively small effect sizes. The omnigenic model attempts to explain this observation by suggesting that diseases can be thought of as networks, where genes with direct involvement in disease-relevant biological pathways are named 'core genes', while peripheral genes influence disease risk via their interactions or regulatory effects on core genes. Here, we demonstrate a method for identifying candidate core genes solely from genes in or near disease-associated SNPs (GWAS hits) in conjunction with protein-protein interaction network data. Applied to 1,381 GWAS studies from 5 ancestries, we identify a total of 1,865 candidate core genes in 343 GWAS studies. Our analysis identifies several well-known disease-related genes that are not identified by GWAS, including BRCA1 in Breast Cancer, Amyloid Precursor Protein (APP) in Alzheimer's Disease, INS in A1C measurement and Type 2 Diabetes, and PCSK9 in LDL cholesterol, amongst others. Notably candidate core genes are preferentially enriched for disease relevance over GWAS hits and are enriched for both Clinvar pathogenic variants and known drug targets-consistent with the predictions of the omnigenic model. We subsequently use parent term annotations provided by the GWAS catalog, to merge related GWAS studies and identify candidate core genes in over-arching disease processes such as cancer-where we identify 109 candidate core genes.

Highlights

  • Genome-wide association studies (GWAS) have uncovered important insights into the genetic basis of complex traits

  • Our analysis identifies several well-known disease-related genes that are not identified by GWAS, including BRCA1 in Breast Cancer, Amyloid Precursor Protein (APP) in Alzheimer’s Disease, INS in A1C measurement and Type 2 Diabetes, and PCSK9 in LDL cholesterol, amongst others

  • protein-protein interactions (PPI) is related to the number of GWAS hits, studies with excess PPI were not limited to those with high numbers of GWAS hits (Fig 1D)

Read more

Summary

Introduction

Genome-wide association studies (GWAS) have uncovered important insights into the genetic basis of complex traits. Contrary to initial expectation, only a few large effect sizevariants with mechanistic links to disease have been identified from these studies [1,2,3,4,5,6,7,8]. ‘Core genes’ which have direct effects on pathways central to pathogenesis (e.g. synaptic genes in schizophrenia [9]) are located at the center of the network, while peripheral genes contribute to disease risk via their influence on core genes. Network inference and exome sequencing to identify deleterious, rare variants, have been proposed for identifying core genes [9] directly, these methods have limitations. The sample size required for exome sequencing to identify rare variants remains uncertain [11] and network inference methods require further technical development [9]

Objectives
Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call