Abstract
Developing predictive models of multi-protein genetic systems to understand and optimize their behavior remains a combinatorial challenge, particularly when measurement throughput is limited. We developed a computational approach to build predictive models and identify optimal sequences and expression levels, while circumventing combinatorial explosion. Maximally informative genetic system variants were first designed by the RBS Library Calculator, an algorithm to design sequences for efficiently searching a multi-protein expression space across a > 10,000-fold range with tailored search parameters and well-predicted translation rates. We validated the algorithm's predictions by characterizing 646 genetic system variants, encoded in plasmids and genomes, expressed in six gram-positive and gram-negative bacterial hosts. We then combined the search algorithm with system-level kinetic modeling, requiring the construction and characterization of 73 variants to build a sequence-expression-activity map (SEAMAP) for a biosynthesis pathway. Using model predictions, we designed and characterized 47 additional pathway variants to navigate its activity space, find optimal expression regions with desired activity response curves, and relieve rate-limiting steps in metabolism. Creating sequence-expression-activity maps accelerates the optimization of many protein systems and allows previous measurements to quantitatively inform future designs.
Highlights
We developed an automated search algorithm, called the RBS Library Calculator, to design the smallest synthetic ribosome-binding site (RBS) library that systematically increases a protein’s expression level across a selected range on a > 10,000-fold proportional scale
By accounting for known differences, we investigated whether the biophysical model can accurately predict translation rates in diverse bacterial hosts
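To illustrate what searching an expression range "on a proportional scale" can mean, here is a minimal Python sketch. It is not the RBS Library Calculator: it only spaces target translation rates evenly on a logarithmic scale and counts how many variants an exhaustive multi-protein search would need. The function names, the arbitrary rate units, and the choice of five levels per protein are assumptions made for illustration.

```python
import math


def proportional_targets(low, high, n_levels):
    """Return n_levels target rates evenly spaced on a log scale.

    Hypothetical helper: consecutive targets differ by a constant fold
    change, so the range from low to high is covered proportionally.
    """
    if n_levels < 2 or low <= 0 or high <= low:
        raise ValueError("need n_levels >= 2 and 0 < low < high")
    fold_step = (high / low) ** (1.0 / (n_levels - 1))
    return [low * fold_step ** i for i in range(n_levels)]


def exhaustive_variant_count(n_levels, n_proteins):
    """Number of variants if every level of every protein is combined."""
    return n_levels ** n_proteins


# Five levels spanning a 10,000-fold range (arbitrary units).
targets = proportional_targets(1.0, 10_000.0, 5)
fold_changes = [b / a for a, b in zip(targets, targets[1:])]

# Exhaustive combinations grow exponentially with the number of proteins.
counts = {k: exhaustive_variant_count(5, k) for k in (1, 2, 3, 4)}
```

With five levels per protein, the exhaustive count grows from 5 variants for one protein to 625 for four, which is the combinatorial growth that a small, well-chosen library is meant to avoid.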
Summary
A microbe’s ability to sense its environment, process signals, perform decision making, and manufacture chemical products is controlled by its DNA sequence (Yim et al, 2011; Du et al, 2012, 2013; Moon et al, 2012; Sandoval et al, 2012; Santos et al, 2012; Tseng & Prather, 2012; Lee et al, 2013; Xu et al, 2013; Zhao et al, 2013). Finding a quantitative relationship between DNA sequence and host behavior has been a central goal in understanding evolution and adaptation, treating human disease, and engineering organisms for biotechnology applications (Strohman, 2002; Wessely et al, 2011; O’Brien et al, 2013; de Vos et al, 2013; Quandt et al, 2014). Recent advances in DNA synthesis, assembly, and mutagenesis have greatly accelerated the construction and modification of large synthetic genetic systems. The development of multiplex genome engineering provides the ability to simultaneously introduce DNA mutations into several genomic loci (Wang et al, 2009, 2012; Esvelt & Wang, 2013). Although techniques are readily available to construct or modify large genetic systems of interest, we currently cannot predict the DNA sequences that will achieve an optimal behavior when the actions of multiple proteins are responsible for a system’s function.