Abstract

Functionally altered biological mechanisms arising from disease-associated polymorphisms remain difficult to characterise when those variants are intergenic, that is, when they fall between genes. We sought to identify shared downstream mechanisms by which inter- and intragenic single-nucleotide polymorphisms (SNPs) contribute to a specific physiopathology. Using computational modelling of 2 million pairs of disease-associated SNPs drawn from genome-wide association studies (GWAS), integrated with expression quantitative trait loci (eQTL) and Gene Ontology functional annotations, we predicted 3,870 inter–inter and inter–intra SNP pairs with convergent biological mechanisms (FDR<0.05). These prioritised SNP pairs with overlapping messenger RNA targets or similar functional annotations were more likely to be associated with the same disease than with unrelated pathologies (OR>12). We additionally confirmed synergistic and antagonistic genetic interactions for a subset of prioritised SNP pairs in independent studies of Alzheimer’s disease (entropy P=0.046), bladder cancer (entropy P=0.039), and rheumatoid arthritis (PheWAS case–control P<10⁻⁴). Using ENCODE data sets, we further statistically validated that the biological mechanisms shared within prioritised SNP pairs are frequently governed by matching transcription factor binding sites and long-range chromatin interactions. These results provide a ‘roadmap’ of disease mechanisms emerging from GWAS and further identify candidate therapeutic targets among downstream effectors of intergenic SNPs.

Highlights

  • The abundance of newly discovered disease-associated polymorphisms enables inquiries about their summative and interactive effects.[1]

  • Lead single-nucleotide polymorphism (SNP) pairs were categorised into three groups based on assertions by dbSNP (Build 138):[27] intergenic–intergenic pairs, when both SNPs are at least 2,000 bp 5ʹ and 500 bp 3ʹ of protein-coding gene coordinates; intergenic–intragenic pairs, when one SNP is intergenic and the other is within gene coordinates; and intragenic–intragenic pairs, when both SNPs are within or near gene coordinates.

  • Further investigation in this direction is supported by our independent prioritisation of SNP pairs associated with liver diseases

  • … Encyclopedia of DNA Elements (ENCODE) and knowledge base of gene annotations (GO) to impute biological effectors of SNPs derived from their shared …
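The three-way categorisation of SNP pairs described in the highlights can be illustrated with a minimal Python sketch. The source states that categories were based on dbSNP (Build 138) assertions; the sketch below instead applies the stated distance rule directly to coordinates, which is an assumption made for illustration. The names `Gene`, `Snp`, `is_intragenic` and `classify_pair` are hypothetical, as is the reading that a SNP exactly 2,000 bp upstream or 500 bp downstream of a gene counts as intergenic.

```python
from dataclasses import dataclass

UPSTREAM_BP = 2000    # 5' flank taken from the text
DOWNSTREAM_BP = 500   # 3' flank taken from the text


@dataclass(frozen=True)
class Gene:
    chrom: str
    start: int    # leftmost genomic coordinate
    end: int      # rightmost genomic coordinate
    strand: str   # "+" or "-"


@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int


def is_intragenic(snp, genes):
    """True if the SNP lies within or near any gene (hypothetical helper).

    'Near' means closer than 2,000 bp on the 5' side or closer than
    500 bp on the 3' side; which side is 5' depends on the strand.
    """
    for g in genes:
        if g.chrom != snp.chrom:
            continue
        if g.strand == "+":
            lo, hi = g.start - UPSTREAM_BP, g.end + DOWNSTREAM_BP
        else:
            lo, hi = g.start - DOWNSTREAM_BP, g.end + UPSTREAM_BP
        # Strict inequalities: a SNP exactly on the flank boundary is
        # treated as intergenic ("at least" 2,000 bp / 500 bp away).
        if lo < snp.pos < hi:
            return True
    return False


def classify_pair(a, b, genes):
    """Label a SNP pair by how many of its members are intragenic."""
    labels = (
        "intergenic-intergenic",
        "intergenic-intragenic",
        "intragenic-intragenic",
    )
    n_intra = sum(is_intragenic(s, genes) for s in (a, b))
    return labels[n_intra]
```

With a single plus-strand gene spanning positions 10,000 to 20,000 on `chr1`, a SNP at 8,500 falls in the 5ʹ flank and is treated as intragenic, whereas a SNP at 20,500 sits exactly 500 bp past the 3ʹ end and is treated as intergenic, so the pair would be labelled intergenic-intragenic under these assumptions.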


Introduction

The abundance of newly discovered disease-associated polymorphisms enables inquiries about their summative and interactive effects.[1] While downstream effects of missense and nonsense coding SNPs can be investigated straightforwardly in cellular and animal models, effects arising from intergenic SNPs remain largely uncharacterised and are often challenging to validate experimentally using in vitro and in vivo assays. We and others have shown that systematically integrating studies of protein–protein interaction with experimentally verified disease-associated coding SNPs enables discovery of new disease-gene candidates and testable associations between biological pathways and disease.[3,4,5,6,7] Other disease-mechanism-based methods have prioritised GWAS signals by leveraging prior biological knowledge inferred from the physical proximity of SNPs to gene loci[8,9,10,11] or from expression quantitative trait loci (eQTL) associations.[12,13,14,15,16,17] Recent high-throughput genomics projects such as The Encyclopedia of DNA Elements (ENCODE) have extended quantitative measures of biological activity into intergenic regions.[18,19] These projects led to integrative genomic analyses and systematic mapping of disease-associated SNPs to regulatory elements, including enhancers, transcription factor (TF) binding sites or chromatin accessibility marks.[20,21,22,23,24,25] … analysis of how downstream disease …
