Abstract

Stochastic fluctuations in gene expression give rise to distributions of protein levels across cell populations. Despite a mounting number of theoretical models explaining stochasticity in protein expression, we lack a robust, efficient, assumption-free approach for inferring the molecular mechanisms that underlie the shape of protein distributions. Here we propose a method for inferring sets of biochemical rate constants that govern chromatin modification, transcription, translation, and RNA and protein degradation from stochasticity in protein expression. We asked whether the rates of these underlying processes can be estimated accurately from protein expression distributions, in the absence of any limiting assumptions. To do this, we (1) derived analytical solutions for the first four moments of the protein distribution, (2) found that these four moments completely capture the shape of protein distributions, and (3) developed an efficient algorithm for inferring gene expression rate constants from the moments of protein distributions. Using this algorithm, we find that most protein distributions are consistent with a large number of different biochemical rate constant sets. Despite this degeneracy, the solution space of rate constants is almost always informative about the underlying mechanism. For example, we distinguish regimes in which transcriptional bursting occurs from regimes of constitutive transcript production. Our method agrees with the current standard approach and, in the restrictive regime where the standard method operates, also identifies rate constants not previously obtainable. Even without making any assumptions, we obtain estimates of individual biochemical rate constants, or meaningful ratios of rate constants, in 91% of tested cases. In some cases our method identified all of the underlying rate constants. The framework developed here will be a powerful tool for deducing the contributions of particular molecular mechanisms to specific patterns of gene expression.

Highlights

  • Stochasticity in transcription and translation produces fluctuations in both RNA [1,2,3,4] and protein [5,6,7,8,9,10,11,12,13,14,15,16]

  • To better grasp what Analytically Constrained Exhaustive Search (ACES) learned about central dogma rate constants (CDRCs) or ratios across the library, we developed a metric of fitness computed for each CDRC parameter

  • Our results suggest that experimentally determining dm and dp will greatly improve the estimation of the remaining CDRCs from experimentally measured protein distributions


Introduction

Stochasticity in transcription and translation produces fluctuations in both RNA [1,2,3,4] and protein [5,6,7,8,9,10,11,12,13,14,15,16]. The result is a protein count distribution that reflects cell-to-cell variation in gene expression. To simulate this cell-to-cell variation in silico, investigators developed a stochastic model of gene expression (Fig. 1), which has proven to be an effective abstraction of the central dogma [1,3,5,8,10,14,20,22]. This model is parameterized by six central dogma rate constants (CDRCs) that govern a gene’s ON (ton) and OFF (toff) transitions, transcription from the active state (km), translation of RNAs (kp), and degradation of RNA (dm) and protein (dp). With a specific set of CDRCs, the gene expression model depicted in Fig. 1 can be simulated with the Gillespie algorithm [23] to produce the corresponding protein count distribution [18,24,25,26,27,28,29].
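To make the model concrete, the following is a minimal Python sketch of a Gillespie simulation of the six-parameter model described above. It is an illustration under stated assumptions, not the authors' implementation: the function names simulate_protein_count and sample_moments, the argument names, the initial condition (gene OFF, no RNA, no protein), and the example rate values are all hypothetical. It assumes ton is the OFF-to-ON switching rate and toff the ON-to-OFF rate, and it summarizes the simulated distribution by mean, variance, skewness, and kurtosis, which is one common way to express four moments and may differ from the convention used in the paper.

```python
import random


def simulate_protein_count(t_on, t_off, k_m, k_p, d_m, d_p, t_end, rng):
    """Simulate one cell with the Gillespie algorithm; return protein count at t_end.

    Assumed initial condition (hypothetical): gene OFF, zero RNA, zero protein.
    """
    gene_on, mrna, protein = False, 0, 0
    t = 0.0
    while True:
        # Propensity of each of the six reactions in the current state.
        propensities = [
            0.0 if gene_on else t_on,   # 0: gene OFF -> ON
            t_off if gene_on else 0.0,  # 1: gene ON -> OFF
            k_m if gene_on else 0.0,    # 2: transcription (ON state only)
            k_p * mrna,                 # 3: translation
            d_m * mrna,                 # 4: RNA degradation
            d_p * protein,              # 5: protein degradation
        ]
        total = sum(propensities)
        if total == 0.0:
            return protein  # no reaction can fire; state is frozen
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            return protein
        # Pick which reaction fires, with probability proportional to propensity.
        reaction = rng.choices(range(6), weights=propensities)[0]
        if reaction == 0:
            gene_on = True
        elif reaction == 1:
            gene_on = False
        elif reaction == 2:
            mrna += 1
        elif reaction == 3:
            protein += 1
        elif reaction == 4:
            mrna -= 1
        else:
            protein -= 1


def sample_moments(counts):
    """Return (mean, variance, skewness, kurtosis) of a list of counts.

    Uses population (1/n) central moments; kurtosis is not excess kurtosis.
    """
    n = len(counts)
    mean = sum(counts) / n
    m2 = sum((c - mean) ** 2 for c in counts) / n
    m3 = sum((c - mean) ** 3 for c in counts) / n
    m4 = sum((c - mean) ** 4 for c in counts) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical rate constants, in arbitrary inverse-time units.
    rates = dict(t_on=1.0, t_off=1.0, k_m=2.0, k_p=2.0, d_m=1.0, d_p=1.0)
    counts = [simulate_protein_count(t_end=20.0, rng=rng, **rates) for _ in range(500)]
    print("sample moments:", sample_moments(counts))
    # Steady-state mean of this model: fraction of time ON times (k_m/d_m)(k_p/d_p).
    on_fraction = rates["t_on"] / (rates["t_on"] + rates["t_off"])
    print("expected mean:", on_fraction * rates["k_m"] / rates["d_m"] * rates["k_p"] / rates["d_p"])
```

Each call simulates a single cell, so repeating the call many times yields a sample from the protein count distribution for that set of CDRCs. The simulation end time should be long compared with 1/dp and 1/dm so that the sampled counts approximate the steady-state distribution.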
