Identifying the combinatorial control of signal-dependent transcription factors.

Ning Wang,Diane Lefaudeux,Anup Mazumder,Jingyi Jessica Li,Alexander Hoffmann,Sushmita Roy

doi:10.1371/journal.pcbi.1009095

Open Access

Abstract

The effectiveness of immune responses depends on the precision of stimulus-responsive gene expression programs. Cells specify which genes to express by activating stimulus-specific combinations of stimulus-induced transcription factors (TFs). Their activities are decoded by a gene regulatory strategy (GRS) associated with each response gene. Here, we examined whether the GRSs of target genes may be inferred from stimulus-response (input-output) datasets, which remains an unresolved model-identifiability challenge. We developed a mechanistic modeling framework and computational workflow to determine the identifiability of all possible combinations of synergistic (AND) or non-synergistic (OR) GRSs involving three transcription factors. Considering different sets of perturbations for stimulus-response studies, we found that two-thirds of GRSs are easily distinguishable but that substantially more quantitative data is required to distinguish the remaining third. To enhance the accuracy of the inference with timecourse experimental data, we developed an advanced error model that avoids error overestimates by distinguishing between value and temporal error. Incorporating this error model into a Bayesian framework, we show that GRS models can be identified for individual genes by considering multiple datasets. Our analysis rationalizes the allocation of experimental resources by identifying the most informative TF stimulation conditions. Applying this computational workflow to experimental data of immune response genes in macrophages, and considering compensation among transcription factors, we found that a much greater fraction of genes is combinatorially controlled than previously reported. Specifically, we revealed that a group of known NFκB target genes may also be regulated by IRF3, which is supported by chromatin immunoprecipitation analysis. Our study provides a computational workflow for designing and interpreting stimulus-response gene expression studies to identify underlying gene regulatory strategies and further a mechanistic understanding.
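To make the notion of AND/OR combinations of three TFs concrete, the following is a minimal illustrative sketch, not the mechanistic model used in the study. It assumes a purely Boolean abstraction in which each TF is either on or off, and it enumerates nested AND/OR formulas over one, two, or three TFs before collapsing formulas that share a truth table. The TF names `A`, `B`, `C` and the helper functions `truth_table`, `enumerate_gates`, and `distinct_gates` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product

TFS = ("A", "B", "C")  # hypothetical placeholder names for three TFs
OPS = {
    "AND": lambda x, y: x and y,  # synergistic: both inputs required
    "OR": lambda x, y: x or y,    # non-synergistic: either input suffices
}


def truth_table(gate):
    """Evaluate a gate on all 8 on/off TF states, in itertools.product order."""
    states = product((False, True), repeat=len(TFS))
    return tuple(bool(gate(dict(zip(TFS, state)))) for state in states)


def enumerate_gates():
    """Map each AND/OR formula over 1-3 TFs to its truth table (Boolean assumption)."""
    gates = {}
    # Single-TF gates: the gene responds to one TF only.
    for a in TFS:
        gates[a] = truth_table(lambda s, a=a: s[a])
    # Two-TF gates: a single AND or OR over each pair.
    for a, b in combinations(TFS, 2):
        for name, op in OPS.items():
            gates[f"{a} {name} {b}"] = truth_table(
                lambda s, a=a, b=b, op=op: op(s[a], s[b])
            )
    # Three-TF gates: x op1 (y op2 z) for every choice of the outer TF x.
    for x in TFS:
        y, z = (t for t in TFS if t != x)
        for (n1, op1), (n2, op2) in product(OPS.items(), repeat=2):
            gates[f"{x} {n1} ({y} {n2} {z})"] = truth_table(
                lambda s, x=x, y=y, z=z, op1=op1, op2=op2: op1(s[x], op2(s[y], s[z]))
            )
    return gates


def distinct_gates(gates):
    """Keep one representative formula per distinct truth table."""
    unique = {}
    for formula, table in gates.items():
        unique.setdefault(table, formula)
    return {formula: table for table, formula in unique.items()}


if __name__ == "__main__":
    all_gates = enumerate_gates()
    unique = distinct_gates(all_gates)
    print(len(all_gates), "formulas,", len(unique), "distinct truth tables")
    for formula, table in unique.items():
        print(f"{formula:<18} {''.join('1' if v else '0' for v in table)}")
```

Under these assumptions, two formulas with identical truth tables cannot be separated by any on/off stimulation experiment, which is the simplest form of non-identifiability. Distinguishing gates whose truth tables differ only in a few TF states is where quantitative, rather than Boolean, data would matter; the sketch does not attempt to model that.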

Highlights

A primary goal of biology is to understand biological phenomena in terms of the underlying factors, whether these are cells, molecules or genes.
In this work we address the question: to what extent are combinatorial transcription factor regulatory strategies identifiable from stimulus-response datasets? We present a computational framework to determine the identifiability of gene regulatory strategies, and examine how reliable, quantitative model inference depends on the quality and quantity of available data.
We apply the workflow to immune response datasets and uncover evidence that many more genes are subject to combinatorial control than previously thought; we offer physical transcription factor binding data to support this finding for one particular group of genes.

Summary

Introduction

A primary goal of biology is to understand biological phenomena in terms of the underlying factors, whether these are cells, molecules or genes. These factors form dynamic regulatory networks whose emergent properties are responsible for biological phenomena. The systems biology approach employs mathematical models that represent or abstract these networks to interpret experimental data. For studies of how genes are expressed, the advent of experimental assays capable of producing genome-wide measurements of mRNA abundance, chromatin-bound factors and modifications has been revolutionary. Because correlative approaches often leverage the statistical power of multiple datapoints from expressed genes, they are not well suited to addressing the regulatory precision of individual genes [7]. That is an important limitation, as many pathological conditions can be traced to a single gene culprit, or a handful [8].

Methods

Results

Discussion

Conclusion