Abstract

DNase I hypersensitive sites (DHSs) define the accessible chromatin landscape and have revolutionised the discovery of distinct cis-regulatory elements in diverse organisms. Here, we report the first comprehensive map of human transcription factor binding site (TFBS)-clustered regions using Gaussian kernel density estimation based on genome-wide mapping of the TFBSs in 133 human cell and tissue types. Approximately 1.6 million distinct TFBS-clustered regions, collectively spanning 27.7% of the human genome, were discovered. The TFBS complexity assigned to each TFBS-clustered region was highly correlated with genomic location, cell selectivity, evolutionary conservation, sequence features, and functional roles. An integrative analysis of these regions using ENCODE data revealed transcription factor occupancy, transcriptional activity, histone modification, DNA methylation, and chromatin structures that varied based on TFBS complexity. Furthermore, we found that we could recreate lineage-branching relationships by simple clustering of the TFBS-clustered regions from terminally differentiated cells. Based on these findings, a model of transcriptional regulation determined by TFBS complexity is proposed.

Highlights

  • An integrative analysis of transcription factor binding site (TFBS)-clustered regions reveals new transcriptional regulation models on the accessible chromatin landscape

  • Consistent with previous studies[10,12,13,17], the TFBSs were highly clustered in distinct human cell types; 91% of the TFBSs were located in only 0.8% of the genome (Fig. 1A)

  • The map of TFBS-clustered regions was generated via genome-wide mapping of the TFBSs in 133 human cell and tissue types using a computational method based on Gaussian kernel density estimation
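As a rough illustration of how Gaussian kernel density estimation can turn individual TFBS positions into clustered regions, the sketch below places a Gaussian kernel on each site, sums the kernels along a coordinate grid, and merges consecutive grid points above a threshold into regions. The site positions, the 50 bp bandwidth, the 10 bp grid step, the threshold, and both function names are illustrative assumptions; none of them are taken from the paper.

```python
import math


def gaussian_kde_density(positions, grid, bandwidth):
    """Sum one Gaussian kernel per TFBS midpoint, evaluated at each grid point.

    The result is left unnormalised by the number of sites, so a single
    isolated site peaks at 1 / (bandwidth * sqrt(2 * pi)).
    """
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for x in grid:
        total = 0.0
        for p in positions:
            z = (x - p) / bandwidth
            total += math.exp(-0.5 * z * z)
        density.append(norm * total)
    return density


def call_clustered_regions(grid, density, threshold):
    """Merge runs of consecutive grid points with density >= threshold
    into (start, end) regions."""
    regions = []
    start = None
    prev = None
    for x, d in zip(grid, density):
        if d >= threshold:
            if start is None:
                start = x
            prev = x
        elif start is not None:
            regions.append((start, prev))
            start = None
    if start is not None:
        regions.append((start, prev))
    return regions


# Hypothetical TFBS midpoints (bp): two tight clusters and one isolated site.
tfbs_midpoints = [1000, 1020, 1050, 1080, 5000, 5010, 5040, 9000]
grid = list(range(0, 10001, 10))

density = gaussian_kde_density(tfbs_midpoints, grid, bandwidth=50.0)

# Assumed threshold: higher than the peak of a single isolated kernel
# (about 0.008 at this bandwidth), so only overlapping sites form a region.
regions = call_clustered_regions(grid, density, threshold=0.012)

for start, end in regions:
    print(f"clustered region: {start}-{end} bp")
```

With these toy inputs the two groups of nearby sites each form a region, while the isolated site at 9000 bp does not, because its single kernel never reaches the threshold. In practice the bandwidth and threshold control how aggressively neighbouring sites are merged.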


Summary

Introduction

An integrative analysis of TFBS-clustered regions reveals new transcriptional regulation models on the accessible chromatin landscape. DNase I hypersensitive sites (DHSs) define the accessible chromatin landscape and have revolutionised the discovery of distinct cis-regulatory elements in diverse organisms. We report the first comprehensive map of human transcription factor binding site (TFBS)-clustered regions using Gaussian kernel density estimation based on genome-wide mapping of the TFBSs in 133 human cell and tissue types. An integrative analysis of these regions using ENCODE data revealed transcription factor occupancy, transcriptional activity, histone modification, DNA methylation, and chromatin structures that varied based on TFBS complexity. We found that we could recreate lineage-branching relationships by simple clustering of the TFBS-clustered regions from terminally differentiated cells. Based on these findings, a model of transcriptional regulation determined by TFBS complexity is proposed. Because TFBSs are hypersensitive to DNase I and are located in only a fraction of the human genome[18], TF motif discovery at DHSs can greatly increase the speed with which TFs can locate their binding sites, and can significantly extend the repertoire of TFs in the human genome.
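The idea of recovering lineage relationships by "simple clustering" can be sketched as follows. The text here does not say which clustering method or distance was used, so this example assumes average-linkage agglomerative clustering on Jaccard distances between sets of region identifiers. The cell-type names, region IDs, and function names are hypothetical and serve only to show the mechanics.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| for two sets of region IDs."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise Jaccard distance between members of two clusters."""
    dists = [
        jaccard_distance(profiles[x], profiles[y])
        for x in cluster_a
        for y in cluster_b
    ]
    return sum(dists) / len(dists)


def agglomerate(profiles):
    """Repeatedly merge the two closest clusters; return the merge order."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(a + b)
        merges.append((a, b))
    return merges


# Hypothetical data: each cell type maps to the IDs of the
# TFBS-clustered regions detected in it.
profiles = {
    "T_cell": {1, 2, 3, 4},
    "B_cell": {1, 2, 3, 5},
    "hepatocyte": {6, 7, 8, 9},
    "fibroblast": {6, 7, 10, 11},
}

merges = agglomerate(profiles)
for left, right in merges:
    print(f"merge {left} + {right}")
```

In this toy input the two cell types that share the most regions are joined first, and the two unrelated groups are joined last, which is the kind of branching structure a dendrogram would display.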

