Abstract

Background: Simulating the major molecular events inside an Escherichia coli cell can lead to a very large number of reactions that compose its overall behaviour. A model must be accurate, but it must also be efficient enough for the experimenter to obtain results in a timely fashion. Here, we show that for many parameter regimes, the effect of the host cell genome on the transcription of a gene from a plasmid-borne promoter is negligible, allowing one to simulate the system more efficiently by removing the computational load associated with representing the rest of the genome. The key parameter is the on-rate of RNAP binding to the promoter (k_on). We compare the total number of transcripts produced from a plasmid vector as a function of this rate constant for two versions of our gene expression model, one incorporating the host cell genome and one excluding it. By sweeping parameters, we identify the k_on range for which the difference between the genome and no-genome models drops below 5%, over a wide range of doubling times, mRNA degradation rates, plasmid copy numbers, and gene lengths.

Results: We assess the effect of simulating the presence of the genome over a four-dimensional parameter space, considering: 24 min <= bacterial doubling time <= 100 min; 10 <= plasmid copy number <= 1000; 2 min <= mRNA half-life <= 14 min; and 10 bp <= gene length <= 10000 bp. A simple MATLAB user interface generates an interpolated k_on threshold for any point in this range; this rate can be compared to the ones used in other transcription studies to assess the need for including the genome.

Conclusion: Exclusion of the genome is shown to yield less than 5% difference in transcript numbers over wide ranges of values, and computational speed is improved by two to 24 times by excluding explicit representation of the genome.

Highlights

  • We implement a "mean-field" approach [10] by considering the production of generic transcripts with properties derived from genome-wide averages, such as mean transcript lengths and mean elongation rates. With these quantities in hand, the only remaining unknown is the RNA polymerase on-rate constant for binding to the reporter promoter, and we find its threshold value by sweeping it until the difference in average transcript number between the models reaches 5%

  • When the binding constant becomes large enough to produce a considerable quantity of transcripts, the genome competes with the reporter gene for access to RNA polymerase and reduces transcription of the reporter gene, producing a significant percent difference between models
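
The threshold search described in the highlights can be illustrated with a minimal Python sketch. This is not the authors' model or their MATLAB interface. The two `toy_*` functions are hypothetical stand-ins for the genome and no-genome simulations, their constants are arbitrary illustrative values, and normalising the difference by the genome model is an assumption, since the text above does not state which model is the reference.

```python
def percent_difference(with_genome, without_genome):
    """Fractional difference in transcript number, relative to the genome model (assumed reference)."""
    return abs(without_genome - with_genome) / with_genome


def find_kon_threshold(model_with, model_without, kon_grid, tol=0.05):
    """Return the first k_on in an ascending grid where the model difference reaches tol, else None."""
    for kon in kon_grid:
        if percent_difference(model_with(kon), model_without(kon)) >= tol:
            return kon
    return None


# Hypothetical toy stand-ins (NOT the paper's simulations): reporter output rises
# with k_on, and only the genome model lets the reporter deplete the shared RNAP pool.
RNAP_POOL = 1000.0      # arbitrary illustrative value
GENOME_LOAD = 4.0       # arbitrary illustrative value
PLASMID_COPIES = 100    # within the 10 to 1000 range quoted in the abstract


def toy_without_genome(kon):
    return kon * PLASMID_COPIES * RNAP_POOL / (1.0 + GENOME_LOAD)


def toy_with_genome(kon):
    return kon * PLASMID_COPIES * RNAP_POOL / (1.0 + GENOME_LOAD + kon * PLASMID_COPIES)


# Logarithmic sweep of k_on (arbitrary units) from about 1e-6 to 1.
kon_grid = [10 ** (-6 + 0.1 * i) for i in range(61)]
threshold = find_kon_threshold(toy_with_genome, toy_without_genome, kon_grid)
print(threshold)
```

Below the returned value the two toy models agree to within 5%, so the cheaper no-genome version would suffice; above it, competition for RNA polymerase makes the genome worth representing. With real simulations, the two callables would each run the corresponding model and return its average transcript count.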

Introduction

Simulating the major molecular events inside an Escherichia coli cell can lead to a very large number of reactions that compose its overall behaviour. A central challenge in cellular modelling is to formulate correct biochemical reaction schemes to represent a process of interest, and to populate the reaction system with appropriate rate constants [5,6,7,8,9]. Within this effort, two persistent difficulties arise: populating mathematical models based on incomplete experimental information [10,11]; and the computational demands of simulating the resulting systems, which can grow large for even moderately complex processes.
