Abstract
Gene expression governs cell fate and is regulated via a complex interplay of transcription factors and molecules that change chromatin structure. Advances in sequencing-based assays have enabled investigation of these processes genome-wide, leading to large datasets that combine information on the dynamics of gene expression, transcription factor binding and chromatin structure as cells differentiate. While numerous studies focus on the effects of these features on broader gene regulation, less work has been done on the mechanisms of gene-specific transcriptional control. In this study, we have focussed on the latter by integrating gene expression data for the in vitro differentiation of murine ES cells to macrophages and cardiomyocytes with dynamic data on chromatin structure, epigenetics and transcription factor binding. Combining a novel strategy to identify communities of related control elements with a penalized regression approach, we developed individual models to identify the potential control elements predictive of the expression of each gene. Our models were compared to an existing method and evaluated using the existing literature and new experimental data from embryonic stem cell differentiation reporter assays. Our method is able to identify transcriptional control elements in a gene-specific manner that reflect known regulatory relationships and to generate useful hypotheses for further testing.
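The per-gene modelling idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the penalty, the features or the fitting procedure, so everything here is an assumption made for illustration: an L1 (lasso) penalty, cyclic coordinate descent as the solver, centred inputs with no intercept, and a toy data set in which one gene's expression depends on only one of three candidate control elements. The names `soft_threshold` and `fit_lasso` are hypothetical, and the community-detection step is not covered. This is not the authors' implementation.

```python
# Hypothetical sketch: L1-penalized (lasso) regression for a single gene,
# fitted by cyclic coordinate descent. Assumes centred features and
# expression values (no intercept). Not the authors' implementation.


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(features, expression, penalty, n_iter=200):
    """Minimise (1/2n) * sum((y - Xw)^2) + penalty * sum(|w|).

    features:   n samples, each a list of p candidate-element signals
    expression: n expression values for one gene
    Returns p weights; a weight of exactly 0.0 means the element is dropped.
    """
    n = len(features)
    p = len(features[0])
    weights = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared length of feature j
            for i in range(n):
                prediction_without_j = sum(
                    features[i][k] * weights[k] for k in range(p) if k != j
                )
                rho += features[i][j] * (expression[i] - prediction_without_j)
                norm += features[i][j] ** 2
            if norm == 0.0:
                weights[j] = 0.0
            else:
                weights[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return weights


# Toy data (invented): 4 samples x 3 candidate control elements, centred.
features = [
    [-1.5, 1.0, -0.5],
    [-0.5, -1.0, 1.5],
    [0.5, -1.0, -1.5],
    [1.5, 1.0, 0.5],
]
# Expression of one hypothetical gene, driven only by element 0.
expression = [-3.0, -1.0, 1.0, 3.0]

weights = fit_lasso(features, expression, penalty=0.1)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print(selected, weights)
```

In this toy example the penalty sets the weights of the two uninformative elements to exactly zero, so only element 0 is retained as a candidate predictor. That sparsity is the property that makes a penalized model usable for selecting control elements gene by gene.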
Highlights
The fate of a cell is determined by dynamics in the expression of genes, a process that is regulated at the highest level by the control of transcription [1, 2].
To investigate this further, we considered existing datasets of enhancers from disparate sources, including a dataset validated using in vivo screening (VISTA) [58], two sets that were collated based on transcription factor (TF) binding and further experimental validation (Schutte et al. and Dogan et al.) [22, 24], a set of super-enhancers identified using genome-wide binding profiles of TFs along with Mediator (Whyte et al.) [53] and a dataset generated through integrative selection from various NGS resources (SEA) [59].
Transcriptional regulation can be investigated at the level of a single gene, where studies lead to detailed understanding of all or most relevant cis-control elements, or at genome scale, where high-throughput studies can reveal many general aspects of regulation.
Summary
The fate of a cell is determined by dynamics in the expression of genes, a process that is regulated at the highest level by the control of transcription [1, 2]. With recent developments in high-throughput sequencing (HTS), researchers have been able to study the genome-wide implications of these processes in various cell types and organisms [6]. From these studies, we have gained global insights into transcriptional regulation, such as the relationship between chromatin accessibility around the promoter region and gene expression [7, 8], the prevalence of histone modifications such as H3K27ac and H3K9ac near expressed genes [9], the presence of H3K27me modification near transcriptionally repressed genes [10] and the binding of master regulators to genes that are often associated with lineage differentiation [11, 12]. The combinatorial binding of TFs to control regions is also well established [12, 20, 21].