Abstract

In multicellular organisms, the spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been used extensively to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Although genome-wide binding profiles are increasingly available for different TFs, single-TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Computational tools that detect statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are therefore essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) in genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which keeps the software efficient in both data structure and execution speed, in particular when mining large datasets. Because transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distances between co-occurring TF motifs. The performance of the software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool for mining statistically and biologically significant TFBS co-occurrences, and it therefore allows the identification of TFs that combinatorially regulate gene expression.
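To make the association-rule idea concrete, the following is a minimal sketch and not the COPS implementation. It replaces the Frequent-Pattern tree with brute-force pair counting, which is only practical for small inputs, and it omits the Markov chain component entirely. The function name `cooccurrence_rules`, the `min_support` threshold and the motif labels `A` to `D` are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_rules(regions, min_support=0.5):
    """Score motif pairs by support, confidence and lift.

    regions: list of sets, each holding the motif names found in one
             genomic region (hypothetical input format).
    Brute-force pair counting stands in for an FP-tree here.
    """
    n = len(regions)
    single = Counter()  # number of regions containing each motif
    pair = Counter()    # number of regions containing each motif pair
    for motifs in regions:
        single.update(motifs)
        pair.update(combinations(sorted(motifs), 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        confidence = count / single[a]  # fraction of a-regions that also hold b
        lift = support / ((single[a] / n) * (single[b] / n))
        rules.append((a, b, support, confidence, lift))
    # Highest support first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[2], r[0], r[1]))


# Toy data: four regions with made-up motif labels.
regions = [{"A", "B", "C"}, {"A", "B"}, {"A", "D"}, {"B", "C"}]
rules = cooccurrence_rules(regions, min_support=0.5)
for a, b, support, confidence, lift in rules:
    print(f"{a}+{b}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means the two motifs occur together more often than expected if they were independent. In this toy set the pair B+C has a lift above 1 while A+B does not, even though both pairs have the same support.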

Highlights

  • In this study we present COPS (Figure 1), a computational tool that detects statistically significant TFBS co-occurrences in genome-wide datasets consisting of genomic regions bound by a single TF in vivo, as shown by chromatin immunoprecipitation (ChIP)-on-Chip, ChIP-Seq or DNA-adenine methyltransferase identification (DamID) experiments


Introduction

Cell-type specific gene expression results from the combinatorial interaction of transcription factors (TFs) with cis-regulatory DNA elements, which are instructed by clusters of TF binding sites (TFBSs) [1,2]. The presence of TFBSs and their spatial arrangement within cis-regulatory modules (CRMs) are critical aspects of the spatiotemporal regulation of gene expression. Preferred TFBS spacing may indicate the formation of regulatory protein complexes mediated by closely adjoining TFBSs (Dbp < 10 bp) [3], indirect interactions mediated by adaptor proteins (Dbp = multiples of 10 bp), or direct or indirect interactions of distant TFs mediated by chromatin structures, i.e. chromatin loops (Dbp = multiples of 100 bp) [4,5]. Prominent examples include the elucidation of the transcriptional networks controlling muscle and nervous system development in Drosophila [6,7] as well as the identification of TF combinations instructing heart development in mammals [8].
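The three spacing classes described above can be written as a small classifier. This is an illustrative sketch only: the function name `spacing_class` and the class labels are hypothetical, exact multiples are required with no tolerance, and because every multiple of 100 bp is also a multiple of 10 bp, the code assumes that the 100 bp class takes precedence. None of these choices are taken from the source.

```python
from collections import Counter


def spacing_class(distance_bp):
    """Map a distance between two TFBSs (in bp) to a spacing class.

    Assumption: the 100 bp class is checked before the 10 bp class,
    since multiples of 100 would otherwise always match the 10 bp rule.
    """
    if distance_bp < 0:
        raise ValueError("distance must be non-negative")
    if distance_bp < 10:
        return "adjoining"      # closely adjoining sites, Dbp < 10 bp
    if distance_bp % 100 == 0:
        return "chromatin"      # Dbp = multiples of 100 bp
    if distance_bp % 10 == 0:
        return "adaptor"        # Dbp = multiples of 10 bp
    return "unclassified"


# Made-up distances between motif pairs, tallied by class.
distances = [4, 7, 20, 30, 37, 200, 300]
tally = Counter(spacing_class(d) for d in distances)
print(dict(tally))
```

With real binding data one would more likely allow a tolerance of a few base pairs around each multiple; that refinement is left out here to keep the mapping easy to read.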
