Abstract

Background: Recent genome-scale surveys of epigenetic states in mammalian genomes have shown that promoters and enhancers are correlated with distinct chromatin signatures, providing a pragmatic way to map these regulatory elements systematically across the genome. With the rapid accumulation of chromatin modification profiles for various organisms and cell types, this chromatin-based approach promises to uncover many new regulatory elements, but computational methods that effectively extract information from these datasets are still limited.

Results: We present here a supervised learning method to predict promoters and enhancers based on their unique chromatin modification signatures. We trained hidden Markov models (HMMs) on the histone modification data for known promoters and enhancers, and then used the trained HMMs to identify promoter-like or enhancer-like sequences in the human genome. Using a simulated annealing (SA) procedure, we searched for the most informative combination of histone marks and the optimal window size.

Conclusion: Compared with previous methods, the HMM method can capture complex patterns of histone modifications, particularly from weak signals. Cross-validation and scanning of the ENCODE regions showed that our method outperforms the previous profile-based method in mapping promoters and enhancers. We also showed that including more histone marks can further boost the performance of our method. This observation suggests that the HMM is robust and capable of integrating information from multiple histone marks. To further demonstrate the usefulness of our method, we applied it to genome-wide ChIP-Seq data from three mouse cell lines and correctly predicted active and inactive promoters with positive predictive values of more than 80%. The software is available at .
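To make the idea of scanning with trained HMMs concrete, the following is a minimal sketch, not the authors' implementation. It assumes the histone signal in a window has already been discretized into three levels (0 = low, 1 = medium, 2 = high) for a single mark, and it scores the window by the log-likelihood ratio between a two-state "promoter-like" HMM and a one-state background model, both evaluated with the forward algorithm. All parameter values, the state layout, and the names `forward_loglik` and `log_odds` are hypothetical and chosen only for illustration; in practice the parameters would be estimated from known promoters or enhancers.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of a discrete sequence under an HMM (forward algorithm).

    All probabilities are assumed to be strictly positive so that
    math.log never receives zero.
    """
    n_states = len(start)
    # alpha[s] = log P(obs[0..t], state at t = s)
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]])
             for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            logsumexp([alpha[r] + math.log(trans[r][s])
                       for r in range(n_states)])
            + math.log(emit[s][symbol])
            for s in range(n_states)
        ]
    return logsumexp(alpha)


# Hypothetical "promoter-like" model: state 0 = flank, state 1 = peak.
PROMOTER = {
    "start": [0.8, 0.2],
    "trans": [[0.7, 0.3], [0.3, 0.7]],
    "emit": [[0.6, 0.3, 0.1],   # flank: mostly low signal
             [0.1, 0.3, 0.6]],  # peak: mostly high signal
}

# Hypothetical background model: a single state emitting mostly low signal.
BACKGROUND = {
    "start": [1.0],
    "trans": [[1.0]],
    "emit": [[0.8, 0.15, 0.05]],
}


def log_odds(window):
    """Log-likelihood ratio of a window: promoter-like model vs background."""
    return (forward_loglik(window, **PROMOTER)
            - forward_loglik(window, **BACKGROUND))


peaked_window = [0, 1, 2, 2, 2, 1, 0]  # signal rises and falls
flat_window = [0, 0, 0, 0, 0, 0, 0]    # no enrichment

print("peaked:", log_odds(peaked_window))
print("flat:  ", log_odds(flat_window))
```

A window would be called promoter-like when its log-odds score exceeds a chosen threshold. Extending the sketch to several histone marks, for example by treating the marks as a joint emission vector, is where a search over mark combinations and window sizes, such as the simulated annealing procedure mentioned above, would come in.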

Highlights

  • Recent genome-scale surveys of epigenetic states in mammalian genomes have shown that promoters and enhancers are correlated with distinct chromatin signatures, providing a pragmatic way to map these regulatory elements systematically across the genome

  • The predictions made by the profile-based method of Heintzman et al. are labeled in green, and the predictions made by the hidden Markov model (HMM) developed in this study are in red. (B) Enhancer prediction using chromatin signature

  • We present here an HMM method to predict promoters and enhancers using their characteristic histone modification patterns

Summary

Introduction

BMC Bioinformatics 2008, 9:547 http://www.biomedcentral.com/1471-2105/9/547

Recent genome-scale surveys of epigenetic states in mammalian genomes have shown that promoters and enhancers are correlated with distinct chromatin signatures, providing a pragmatic way to map these regulatory elements systematically across the genome. [...] A high-throughput experimental approach has recently been used to tackle this problem: the chromatin immunoprecipitation assay followed by microarray (ChIP-chip) [3,4] or large-scale sequencing (ChIP-Seq) [5-8]. This approach is still limited by the availability of antibodies that recognize individual transcription factors (TFs) at different regulatory elements. Another method involves comparative genomic analysis of related genomes [9,10] and clustering of multiple sequence motifs [11,12,13]. This approach has been successfully applied to a number of eukaryotic genomes, including yeast, Drosophila and mammalian genomes (for a review see, for example, [14]). These methods rely either on precise alignment of regulatory elements across multiple genomes, which does not necessarily hold for all elements, or on prior knowledge of a set of cooperating TFs, which is not always available.
