Excess utilization of translational resources is a critical source of burden on cells engineered to overexpress exogenous proteins. To improve protein yields and genetic stability, researchers often use codon optimization strategies that improve translational efficiency by matching an exogenous gene's codon usage with that of the host organism's highly expressed genes. Despite empirical data showing the benefits of codon optimization, little is known quantitatively about the relationship between codon usage bias and the burden imposed by protein overexpression. Here, we develop and experimentally evaluate a stochastic gene expression model that considers the impact of codon usage bias on the availability of ribosomes and different tRNAs in a cell. In agreement with other studies, our model shows that increasing exogenous protein expression decreases production of native cellular proteins in a linear fashion. We also find that the slope of this relationship is modulated by how well the codon usage bias of the exogenous gene matches that of the host's genes. Strikingly, we predict that an overoptimization domain exists in which further increasing the usage of optimal codons worsens both yield and burden. We test our model by expressing sfGFP and mCherry2 in Escherichia coli from constructs that span a wide range of codon optimization levels. The results agree with our model, including for an mCherry2 gene sequence that appears to lose expression and genetic stability from codon overoptimization. Researchers can leverage our findings to predict and design better-performing cellular systems through more nuanced codon optimization strategies.
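As a rough illustration of what "matching" codon usage between an exogenous gene and host genes can mean, the sketch below compares the codon frequency profiles of two in-frame coding sequences. It is not the stochastic model described above: the cosine similarity metric and the function names `codon_frequencies` and `usage_similarity` are assumptions chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def codon_frequencies(seq):
    """Relative frequency of each codon in an in-frame coding sequence."""
    seq = seq.upper()
    # Ignore any trailing bases that do not form a complete codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def usage_similarity(gene, host):
    """Cosine similarity of two codon frequency profiles (illustrative metric).

    Returns a value between 0 (no shared codons) and 1 (identical profiles).
    """
    f = codon_frequencies(gene)
    g = codon_frequencies(host)
    dot = sum(freq * g.get(codon, 0.0) for codon, freq in f.items())
    norm = sqrt(sum(v * v for v in f.values())) * sqrt(sum(v * v for v in g.values()))
    return dot / norm if norm else 0.0
```

A score near 1 would indicate that the gene draws on codons in roughly the same proportions as the host reference set. Note that a single similarity score like this cannot by itself capture the overoptimization effect predicted above, which depends on ribosome and tRNA availability.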