Abstract

Background. Recent epigenomic studies have shown that the length of a DNA region covered by an epigenetic mark is not just a byproduct of the assaying technology; it has functional implications for that locus. For example, expanded regions of DNA marked by enhancer-specific histone modifications, such as acetylation of histone H3 lysine 27 (H3K27ac), coincide with cell-specific enhancers known as super or stretch enhancers. Similarly, promoters of genes critical for cell-specific functions are marked by expanded H3K4me3 domains in the cognate cell type, and these can span DNA regions from 4–5 kb up to 40–50 kb in length. These expanded H3K4me3 domains are known as buffer domains or super promoters.

Methods. To ask what correlates with, and potentially regulates, the length of loci marked with these two important histone marks, H3K4me3 and H3K27ac, we built Random Forest regression models. With these models, we computationally identified genomic and epigenomic patterns that are predictive of the length of these marks in seven ENCODE cell lines.

Results. We found that certain epigenetic marks and transcription factors explain the variability of the length of H3K4me3 and H3K27ac marks across different cell types, which implies that the lengths of these two epigenetic marks are tightly regulated in a given cell type. Our source code for the regression models and data can be found at our GitHub page: https://github.com/zubekj/broad_peaks.

Discussion. Our Random Forest based regression models enabled us to estimate the individual contribution of different epigenetic marks and protein binding patterns to the length of H3K4me3 and H3K27ac deposition patterns, thereby potentially revealing genomic signatures at cell-specific regulatory elements.

Highlights

  • Epigenomics refers to heritable changes that do not stem from changes in the genomic DNA sequence

  • The span of epigenetic mark deposition has gained attention in recent years with several studies, including ours, showing that long stretches of DNA marked with H3K4me3 or H3K27ac coincide with cell-type specific promoters or enhancers, respectively (Hnisz et al, 2013; Chapuy et al, 2013; Parker et al, 2013; Benayoun et al, 2014; Bernstein, Meissner & Lander, 2007)

  • Among the predictors we identified, CHD1 binding is important for predicting both H3K27ac and H3K4me3 domain lengths



Introduction

Epigenomics refers to heritable changes that do not stem from changes in the genomic DNA sequence. In a recent publication, we showed that longer domains (from 5 to 50 kb) of histone H3 lysine 4 trimethylation (H3K4me3) preferentially mark genes associated with cell identity and function in diverse cell types and organisms (Benayoun et al, 2014). To ask what correlates with, and potentially regulates, the length of loci marked with two important histone marks, H3K4me3 and H3K27ac, we built Random Forest regression models. With these models, we computationally identified genomic and epigenomic patterns that are predictive of the length of these marks in seven ENCODE cell lines. Our Random Forest based regression models enabled us to estimate the individual contribution of different epigenetic marks and protein binding patterns to the length of H3K4me3 and H3K27ac deposition patterns, potentially revealing genomic signatures at cell-specific regulatory elements.
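The general idea of regressing domain length on a set of signals and then ranking those signals by their contribution can be illustrated with a small toy example. The sketch below is not the authors' implementation (their code is in the GitHub repository cited in the abstract); it is a simplified pure-Python random forest run on synthetic data. The feature names (`signal_a`, `signal_b`, `noise`), the coefficients, the split rule (thresholding at the feature mean) and the use of permutation importance are all assumptions made for illustration; the text here does not state which importance measure the study used.

```python
import random
from statistics import fmean

def build_tree(rows, targets, n_try, rng, depth=0, max_depth=4, min_leaf=5):
    """Grow a regression tree; each split considers n_try randomly chosen features."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return fmean(targets)  # leaf: mean target value
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        thr = fmean(r[f] for r in rows)  # simplification: one candidate threshold
        left = [t for r, t in zip(rows, targets) if r[f] <= thr]
        right = [t for r, t in zip(rows, targets) if r[f] > thr]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        ml, mr = fmean(left), fmean(right)
        sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, f, thr)
    if best is None:
        return fmean(targets)
    _, f, thr = best
    go_left = [r[f] <= thr for r in rows]
    args = (n_try, rng, depth + 1, max_depth, min_leaf)
    left = build_tree([r for r, g in zip(rows, go_left) if g],
                      [t for t, g in zip(targets, go_left) if g], *args)
    right = build_tree([r for r, g in zip(rows, go_left) if not g],
                       [t for t, g in zip(targets, go_left) if not g], *args)
    return (f, thr, left, right)

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are floats."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node

def fit_forest(rows, targets, n_trees=20, n_try=2, seed=0):
    """Train each tree on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], n_try, rng))
    return forest

def mse(forest, rows, targets):
    """Mean squared error of the forest's averaged prediction."""
    preds = [fmean(predict_tree(t, r) for t in forest) for r in rows]
    return fmean((p - t) ** 2 for p, t in zip(preds, targets))

def permutation_importance(forest, rows, targets, seed=1):
    """Increase in error after shuffling one feature column at a time."""
    rng = random.Random(seed)
    base = mse(forest, rows, targets)
    scores = []
    for f in range(len(rows[0])):
        col = [r[f] for r in rows]
        rng.shuffle(col)
        shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, col)]
        scores.append(mse(forest, shuffled, targets) - base)
    return scores

# Synthetic stand-in data: hypothetical features, not real ENCODE signals.
data_rng = random.Random(42)
feature_names = ["signal_a", "signal_b", "noise"]
X = [[data_rng.random() for _ in feature_names] for _ in range(200)]
# Hypothetical "domain length": driven by the first two features only.
y = [5 * a + 2 * b + data_rng.gauss(0, 0.1) for a, b, _ in X]

forest = fit_forest(X, y)
importance = permutation_importance(forest, X, y)
ranked = sorted(zip(feature_names, importance), key=lambda p: -p[1])
print(ranked)
```

In this toy setting, the feature that drives most of the target's variance should come out at the top of the ranking. For brevity the importance is computed on the training data; a real analysis would more likely score on held-out data and use an established library rather than a hand-written forest.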

