Abstract
Detecting in vivo transcription factor (TF) binding is important for understanding gene regulatory circuitries. ChIP-seq is a powerful technique to empirically define TF binding in vivo. However, the multitude of distinct TFs makes genome-wide profiling for them all labor-intensive and costly. Algorithms for in silico prediction of TF binding have been developed, based mostly on histone modification or DNase I hypersensitivity data in conjunction with DNA motif and other genomic features. However, technical limitations of these methods prevent them from being applied broadly, especially in clinical settings. We conducted a comprehensive survey involving multiple cell lines, TFs, and methylation types and found that there are intimate relationships between TF binding and methylation level changes around the binding sites. Exploiting the connection between DNA methylation and TF binding, we proposed a novel supervised learning approach to predict TF–DNA interaction using data from base-resolution whole-genome methylation sequencing experiments. We devised beta-binomial models to characterize methylation data around TF binding sites and the background. Along with other static genomic features, we adopted a random forest framework to predict TF–DNA interaction. After conducting comprehensive tests, we saw that the proposed method accurately predicts TF binding and performs favorably versus competing methods.
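To make the beta-binomial idea concrete, the following is a minimal sketch, not the authors' implementation. It scores a window of CpG sites by the log-likelihood ratio between a "bound" and a "background" beta-binomial model of methylated read counts. The function names, the parameter values, and the assumption that bound sites tend to be less methylated than background are all hypothetical and chosen only for illustration; in practice the parameters would be estimated from training data.

```python
from math import lgamma, exp

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_choose(n, k):
    # log of the binomial coefficient C(n, k)
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def betabinom_logpmf(k, n, alpha, beta):
    # log P(k methylated reads out of n) under Beta-Binomial(alpha, beta)
    return log_choose(n, k) + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)

def log_likelihood_ratio(counts, bound_params, background_params):
    # counts: list of (methylated_reads, total_reads), one pair per CpG site.
    # Returns the sum over sites of log P(site | bound) - log P(site | background).
    llr = 0.0
    for k, n in counts:
        llr += betabinom_logpmf(k, n, *bound_params)
        llr -= betabinom_logpmf(k, n, *background_params)
    return llr

# Hypothetical parameters: bound sites mostly unmethylated (mean 0.1),
# background mostly methylated (mean 0.9). Not taken from the paper.
BOUND = (1.0, 9.0)
BACKGROUND = (9.0, 1.0)

low_meth_window = [(1, 20), (0, 15), (2, 30)]
high_meth_window = [(18, 20), (15, 15), (27, 30)]

score_low = log_likelihood_ratio(low_meth_window, BOUND, BACKGROUND)
score_high = log_likelihood_ratio(high_meth_window, BOUND, BACKGROUND)
```

Under these assumed parameters, a positive score favors the bound model and a negative score favors the background. A score of this kind could then serve as one input feature, alongside static genomic features, to a random forest classifier.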
Highlights
A fundamental goal of functional genomic research is to understand gene regulation
More recent observations have implicated the iterative oxidation of 5mC to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) in pathways that serve to offset 5mC levels and facilitate transcription factor (TF) binding [30]. All these findings indicate that DNA methylation levels offer clues as to whether TF binding has occurred at a particular locus, which may be exploited as an alternative to DNase I or histone data for predicting TF binding in vivo.
We developed Methylphet, a novel computational method and software package to predict TF binding using a combination of methylation profiles and genomic features
Summary
A fundamental goal of functional genomic research is to understand gene regulation. Gene expression can be controlled by epigenetic mechanisms via the coordinated binding of transcription factors (TFs), histone modifications, and DNA methylation [1]. An important first step toward deciphering the complexities of gene regulatory networks is detecting the activities of functional elements, such as TF binding sites in the genome. Advances in high-throughput sequencing technologies such as ChIP-seq [2,3,4] and ChIP-exo [5] allow the comprehensive genome-wide profiling of protein–DNA binding sites. However, profiling each TF individually is a challenge in clinical settings because the amount of biological material available is often limited. For these reasons, developing in silico approaches to predict in vivo TF binding sites that do not rely on ChIP-seq is desirable.