Abstract

Background: The combinatorial binding of trans-acting factors (TFs) to the DNA is critical to the spatial and temporal specificity of gene regulation. In certain regulatory regions, multiple regulatory modules (sets of TFs that bind together) are combined to achieve context-specific gene regulation. However, previous approaches either analyze only pairwise TF co-association or assume that only one module is used in each regulatory region.

Results: We present a new computational approach that models the modular organization of TF combinatorial binding. Our method learns compact and coherent regulatory modules from in vivo binding data using a topic model. We found that the binding of 115 TFs in K562 cells can be organized into 49 interpretable modules. Furthermore, we found that tens of thousands of regulatory regions use multiple modules, a structure that cannot be observed with previous methods based on hard clustering. The discovered modules recapitulate many published protein-protein physical interactions, have consistent functional annotations of chromatin states, and uncover context-specific co-binding such as gene-proximal binding of NFY + FOS + SP and distal binding of NFY + FOS + USF. For certain TFs, the co-binding partners of direct binding (motif present) differ from those of indirect binding (motif absent); these distinct sets of co-binding partners can predict whether the TF binds directly or indirectly with up to 95% accuracy. Joint analysis across two cell types reveals both cell-type-specific and shared regulatory modules.

Conclusions: Our results provide comprehensive cell-type-specific combinatorial binding maps and suggest a modular organization of combinatorial binding.

Highlights

  • The combinatorial binding of trans-acting factors (TFs) to the DNA is critical to the spatial and temporal specificity of gene regulation

  • Regulatory Module Discovery (RMD) is based on Hierarchical Dirichlet Processes [30], a Bayesian nonparametric topic model that automatically determines the number of modules based on the complexity of the observed data

  • Gene regulation specificity is orchestrated by the interactions among a complex group of trans-acting factors that we have organized into distinct combinable modules
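
The topic-model view of combinatorial binding can be illustrated with a small sketch. It is not the RMD implementation: RMD uses a Hierarchical Dirichlet Process, which infers the number of modules from the data, whereas this example swaps in plain Latent Dirichlet Allocation with a fixed number of modules, fitted by collapsed Gibbs sampling. The sketch assumes the usual topic-model analogy (each regulatory region is a "document" and each bound TF is a "word"). The function name `fit_lda`, the hyperparameters, and the toy binding data are all hypothetical.

```python
import random

def fit_lda(regions, n_modules, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA with a fixed number of modules.

    regions: list of lists, the TFs bound in each regulatory region.
    Returns (region_module_counts, module_tf_probs).
    """
    rng = random.Random(seed)
    tfs = sorted({tf for region in regions for tf in region})
    tf_index = {tf: i for i, tf in enumerate(tfs)}
    n_tfs = len(tfs)

    # Count tables maintained by the sampler.
    region_module = [[0] * n_modules for _ in regions]
    module_tf = [[0] * n_tfs for _ in range(n_modules)]
    module_total = [0] * n_modules

    # Assign every binding event to a random module to start.
    assignments = []
    for r, region in enumerate(regions):
        row = []
        for tf in region:
            k = rng.randrange(n_modules)
            row.append(k)
            region_module[r][k] += 1
            module_tf[k][tf_index[tf]] += 1
            module_total[k] += 1
        assignments.append(row)

    for _ in range(n_iters):
        for r, region in enumerate(regions):
            for i, tf in enumerate(region):
                w = tf_index[tf]
                k = assignments[r][i]
                # Remove the current assignment from the counts.
                region_module[r][k] -= 1
                module_tf[k][w] -= 1
                module_total[k] -= 1
                # Conditional probability of each module for this event.
                weights = [
                    (region_module[r][j] + alpha)
                    * (module_tf[j][w] + beta)
                    / (module_total[j] + n_tfs * beta)
                    for j in range(n_modules)
                ]
                k = rng.choices(range(n_modules), weights=weights)[0]
                assignments[r][i] = k
                region_module[r][k] += 1
                module_tf[k][w] += 1
                module_total[k] += 1

    # Smoothed TF distribution for each module.
    module_tf_probs = [
        {
            tf: (module_tf[k][tf_index[tf]] + beta)
            / (module_total[k] + n_tfs * beta)
            for tf in tfs
        }
        for k in range(n_modules)
    ]
    return region_module, module_tf_probs

# Hypothetical toy data: TF labels borrowed from the abstract, binding invented.
regions = [
    ["NFY", "FOS", "SP"],
    ["NFY", "FOS", "SP"],
    ["NFY", "FOS", "USF"],
    ["NFY", "FOS", "USF"],
    ["CTCF", "NFY", "FOS", "SP"],
    ["CTCF", "NFY", "FOS", "USF"],
]
usage, modules = fit_lda(regions, n_modules=2)
for r, counts in enumerate(usage):
    print(f"region {r}: binding events per module = {counts}")
```

Each row of `usage` counts how many binding events in a region were assigned to each module, so a region can draw on more than one module. A hard clustering, by contrast, would assign each region to exactly one module.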

Introduction

The combinatorial binding of trans-acting factors (TFs) to the DNA is critical to the spatial and temporal specificity of gene regulation. Previous studies have found that TFs tend to bind in clusters, which are typically characterized by a large number of TF binding sites in a regulatory region [9,10,11,12]. These co-binding TFs may belong to different functional modules that can be combined in regulatory regions to achieve specific functions. CTCF/cohesin modules may co-occur with enhancer-related modules, promoter-related modules, or both. Such module co-occurrences suggest that combinatorial binding of TFs may be organized in a modular hierarchy: a regulatory region may use a combination of
