Abstract

Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. A useful description of the TRN would decompose the transcriptome into targeted effects of individual transcriptional regulators. Here, we apply unsupervised machine learning to a diverse compendium of over 250 high-quality Escherichia coli RNA-seq datasets to identify 92 statistically independent signals that modulate the expression of specific gene sets. We show that 61 of these transcriptomic signals represent the effects of currently characterized transcriptional regulators. Condition-specific activation of signals is validated by exposure of E. coli to new environmental conditions. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Taken together, our results show that signal summation describes the composition of a model prokaryotic transcriptome.

Highlights

  • Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression

  • To assemble PRECISE, we collected and processed RNA-seq data from over 15 studies published by our research group, comprising ~20% of all publicly available RNA-seq data in NCBI GEO[33] for E. coli K-12 MG1655 and BW25113 (Supplementary Fig. 1c)

  • We have demonstrated that the combination of (1) independent component analysis (ICA) of high-quality RNA-seq data and (2) high-resolution comprehensive regulator-binding site information identifies linear combinations of quantitative regulatory signals that reconstitute the E. coli transcriptome, leading to the first E. coli TRN inferred from an RNA-seq compendium

Summary

Introduction

Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Environmental and genetic perturbations alter the activity states of transcriptional regulators to change their DNA-binding affinity[8], which in turn modulates the transcriptome in a condition-specific manner[9]. A measured expression profile reflects a combination of the activities of all transcriptional regulators under the examined condition. This poses the fundamental deconvolution challenge of separating the condition-invariant network structure from its condition-dependent expression state on a genome scale. A comprehensive review of 42 module detection methods showed that independent component analysis (ICA), a signal deconvolution algorithm, outperformed all other algorithms in identifying groups of coregulated genes[12].
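
The deconvolution idea can be illustrated with a minimal sketch, assuming the expression matrix X (genes by conditions) is approximately a product S·A of condition-invariant gene weights S and condition-dependent activities A. Everything below is hypothetical and for illustration only: the data are synthetic, the matrix sizes and noise level are arbitrary, and scikit-learn's FastICA stands in for whatever ICA implementation and post-processing the study actually used.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_genes, n_conditions, n_signals = 500, 60, 3  # arbitrary illustrative sizes

# Hypothetical condition-invariant structure: each signal targets a small gene set,
# so the gene-weight columns are sparse and strongly non-Gaussian.
S_true = np.zeros((n_genes, n_signals))
for k in range(n_signals):
    targets = rng.choice(n_genes, size=20, replace=False)
    S_true[targets, k] = rng.normal(loc=3.0, scale=0.5, size=20)

# Hypothetical condition-dependent activities of each signal.
A_true = rng.normal(size=(n_signals, n_conditions))

# A measured profile is modeled as a sum of signals plus small measurement noise.
X = S_true @ A_true + 0.05 * rng.normal(size=(n_genes, n_conditions))

# Treat genes as samples and conditions as features, so the recovered
# independent components are gene-weight vectors.
ica = FastICA(n_components=n_signals, random_state=0, max_iter=1000)
S_est = ica.fit_transform(X)   # shape: (n_genes, n_signals)
A_est = ica.mixing_.T          # shape: (n_signals, n_conditions)

# ICA recovers components only up to sign, scale, and order, so compare
# true and estimated gene weights by absolute correlation.
corr = np.abs(np.corrcoef(S_true.T, S_est.T)[:n_signals, n_signals:])
best_match = corr.max(axis=1)
print("best |correlation| per true signal:", np.round(best_match, 3))
```

Genes are treated as samples here so that statistical independence is imposed on the gene-weight vectors rather than on the activities; this matches the goal of finding distinct gene sets, but it is a modeling choice in this sketch, not a detail confirmed by the text above.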

