Abstract

We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3′ UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner.
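The enrichment claim above can be illustrated with a small sketch. It assumes a one-sided hypergeometric test on the overlap between the top-ranked genes by TF load and a set of disease-associated genes. This section does not state which statistical test the study used, so the test choice, the gene names, the load values, and the 30% cutoff are all hypothetical and serve only to show the shape of the calculation.

```python
from math import comb


def top_load_genes(tf_load, fraction):
    """Return the genes in the top `fraction` when ranked by TF load (hypothetical cutoff)."""
    n_top = max(1, int(len(tf_load) * fraction))
    ranked = sorted(tf_load, key=tf_load.get, reverse=True)
    return set(ranked[:n_top])


def hypergeom_enrichment_p(total_genes, disease_genes, selected_genes, overlap):
    """One-sided P(X >= overlap) for X ~ Hypergeometric(total, disease, selected)."""
    max_overlap = min(disease_genes, selected_genes)
    denom = comb(total_genes, selected_genes)
    tail = sum(
        comb(disease_genes, k) * comb(total_genes - disease_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denom


# Toy data: number of TFs bound per gene (made-up values, not from the study).
tf_load = {"G1": 40, "G2": 35, "G3": 30, "G4": 5, "G5": 4,
           "G6": 3, "G7": 2, "G8": 2, "G9": 1, "G10": 0}
# Toy set of disease-associated genes (made up, standing in for a DisGeNET-style list).
disease = {"G1", "G2", "G5"}

top = top_load_genes(tf_load, fraction=0.3)
overlap = len(top & disease)
p_value = hypergeom_enrichment_p(len(tf_load), len(disease), len(top), overlap)
print(f"high-load genes: {sorted(top)}, overlap: {overlap}, p = {p_value:.4f}")
```

With real data, the same calculation would be repeated per cell type, since the abstract reports enrichment in a cell type-selective manner, and the resulting p-values would need correction for multiple testing.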

Highlights

  • Identification of disease-relevant genes and gene products as biomarkers and drug targets is one of the key tasks of biomedical research

  • To investigate whether combinatorial regulation of disease-linked genes is a general finding across different cell types and gene categories, we took advantage of the numerous ChIP-Seq data sets of transcription factor (TF) binding produced by the ENCODE project in a number of cell types [19]

  • We show that genes regulated by a high TF load are more likely to be disease-associated genes and can be identified across cell types through epigenomic mapping of active enhancers

Introduction

Identification of disease-relevant genes and gene products as biomarkers and drug targets is one of the key tasks of biomedical research. The great majority of research focuses on a small minority of genes, while over a third of genes remain unstudied [1]. Unbiased prioritization among these ignored genes is needed to realize the full potential of genomics in understanding diseases. One of the more comprehensive databases, DisGeNET [4,5], draws on multiple sources as well as text-mining approaches to generate gene-disease networks in which genes are associated with diseases by various types of evidence, ranging from altered expression and genetic variation to existing therapeutic association. DisGeNET already links many human genes to at least one disease, and it highlights both the multigenic background of most diseases and the association of many genes with multiple diseases [4,5].
