Abstract

Overlap between non-coding regulatory DNA sequences and common variant associations can help identify the specific cell and tissue types relevant to particular diseases. We systematically analyzed variants from 94 genome-wide association studies (GWAS; each reporting at least 12 loci at p < 5 × 10⁻⁸) by projecting them onto 466 epigenetic datasets characterizing DNase I hypersensitive sites (DHSs), derived from various adult and fetal tissue samples and cell lines and including many biological replicates. We confirmed many expected associations, such as the involvement of specific immune cell types in immune-related diseases and of specific tissue types in diseases that affect particular organs, for example, inflammatory bowel disease and coronary artery disease. Other notable associations include adrenal glands in coronary artery disease, the immune system in Alzheimer’s disease, and the kidney for bone marrow density. The association signals for some GWAS (for example, myopia or age at menarche) did not show a clear pattern with any of the cell or tissue types studied. In general, variants identified by GWAS tend to lie outside coding regions. Altogether, we have extensively characterized GWAS signals in relation to cell- and tissue-specific DHSs, demonstrating a key role for regulatory mechanisms in common diseases and complex traits.
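To make the idea of "projecting" variants onto DHS datasets concrete, the following is a minimal sketch of a position-in-interval lookup. It is not the authors' pipeline: the function names, the half-open coordinate convention, and the toy SNP and DHS coordinates are all assumptions made for illustration, and the sketch ignores linkage disequilibrium and any statistical test of enrichment.

```python
from bisect import bisect_right


def build_index(intervals):
    """Merge overlapping half-open (start, end) intervals and return
    parallel, sorted lists of starts and ends."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [s for s, _ in merged], [e for _, e in merged]


def in_dhs(pos, index):
    """Return True if pos falls inside any merged interval."""
    starts, ends = index
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]


def overlap_fraction(snps, dhs_by_chrom):
    """Fraction of (chrom, pos) SNPs that fall inside a DHS interval."""
    indexes = {chrom: build_index(iv) for chrom, iv in dhs_by_chrom.items()}
    hits = sum(
        1 for chrom, pos in snps
        if chrom in indexes and in_dhs(pos, indexes[chrom])
    )
    return hits / len(snps) if snps else 0.0


# Hypothetical toy data: four SNPs from one GWAS, DHS peaks for two tissues.
snps = [("chr1", 150), ("chr1", 450), ("chr2", 75), ("chr2", 900)]
dhs_by_tissue = {
    "tissue_A": {"chr1": [(100, 200), (180, 260)], "chr2": [(50, 100)]},
    "tissue_B": {"chr1": [(400, 500)]},
}

fractions = {
    tissue: overlap_fraction(snps, dhs)
    for tissue, dhs in dhs_by_tissue.items()
}
for tissue, fraction in fractions.items():
    print(tissue, fraction)
```

Comparing such per-tissue fractions only hints at which tissue is relevant; a real analysis would also need a background expectation, for example from matched control SNPs, before calling any overlap an enrichment.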

Highlights

  • In the last decade, genome-wide association studies (GWAS) have identified a plethora of single nucleotide polymorphisms (SNPs) robustly associated with various quantitative traits and complex diseases [1].

  • We have focused on these particular datasets, since DNase I hypersensitive sites (DHSs) are considered to be one of the best discriminative features between cell types [6].

  • Even though the chromatin mark H3 lysine 4 trimethylation (H3K4me3) was shown to be slightly better at predicting “critical” cell types [6], we used DHSs since these were available for a larger number of different cell types and tissues and, importantly, included data for biological replicates (S1C and S1D Fig).

Introduction

Genome-wide association studies (GWAS) have identified a plethora of single nucleotide polymorphisms (SNPs) robustly associated with various quantitative traits and complex diseases [1]. The vast majority of these SNPs are located outside of coding regions and do not affect the primary sequence of protein-coding genes [1]. Due to linkage disequilibrium in the human genome, these SNPs should be considered as markers for ...

PLOS ONE | DOI:10.1371/journal.pone.0165893 | Extensive Association of Common Disease Variants with Regulatory Sequence | November 22, 2016
