Abstract

The identification of cis-acting elements on DNA is crucial for understanding the complex regulatory networks that govern many cellular mechanisms. However, this task is difficult: an estimated 1500 different transcription factors (TFs) are encoded in the human genome, each of which can bind to multiple loci directly or indirectly. The standard computational approach uses a position weight matrix (PWM) to represent the binding preference of a transcription factor, together with statistical procedures to detect genomic regions with high binding scores. Because most PWMs carry small and degenerate signals, this approach suffers from a very high number of false positive hits. Recent research has shown that genome-wide assays reflecting open chromatin, such as DNase digestion or histone modifications, can improve sequence-based detection of the binding locations of transcription factors that are active in a particular cell type. We propose here a multivariate hidden Markov model that improves the prediction of transcription factor binding locations by integrating DNase digestion and histone modification data. Our methodology improves sensitivity in comparison to existing methods, with little or no loss of specificity. This study shows that the predictive power for cis-acting elements can be improved by correctly integrating DNase and histone modification data, allowing for more sophisticated studies using a larger set of epigenetic signals.

Keywords

cis-regulatory elements; DNase I-hypersensitive sites; histone modifications; hidden Markov models
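
The PWM-based scan that the abstract calls the standard computational approach can be illustrated with a minimal sketch. This is not the authors' method and does not implement the hidden Markov model. The toy count matrix, the pseudocount, the uniform background frequency, and the score threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds matrix.

    Assumes a uniform background frequency; real genomes are not uniform.
    """
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold.

    Only handles the characters A, C, G and T on the forward strand.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical counts for a 4-position motif that prefers "TATA".
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
toy_pwm = log_odds_pwm(toy_counts)

# The threshold of 4.0 is an arbitrary illustrative cutoff.
hits = scan("GGTATACC", toy_pwm, threshold=4.0)
```

With a motif this short, lowering the threshold quickly admits partial matches, which is the false positive problem the abstract describes. Integrating open chromatin signals is one way to restrict such hits to regions that are plausibly active in a given cell type.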

