Abstract
Ribozyme switches are a class of RNA-encoded genetic switch that support conditional regulation of gene expression across diverse organisms. An improved elucidation of the relationships between sequence, structure, and activity can improve our capacity for de novo rational design of ribozyme switches. Here, we generated data on the activity of hundreds of thousands of ribozyme sequences. Using automated structural analysis and machine learning, we leveraged these large data sets to develop predictive models that estimate the in vivo gene-regulatory activity of a ribozyme sequence. These models supported the de novo design of ribozyme libraries with low mean basal gene-regulatory activities and new ribozyme switches that exhibit changes in gene-regulatory activity in the presence of a target ligand, producing functional switches for four out of five aptamers. Our work examines how biases in the model and the data set that affect prediction accuracy can arise and demonstrates that machine learning can be applied to RNA sequences to predict gene-regulatory activity, providing the basis for design tools for functional RNAs.
Highlights
The genetic engineering of novel biological systems has the ability to produce solutions to a wide array of global challenges [1]
We developed a training data set for the model by constructing a library of 150,000 unique ribozymes and used a high-throughput fluorescence-activated cell sorting (FACS)-Seq screening method to measure the gene-regulatory activities of individual library sequences
Libraries were cloned into the 3’ untranslated region (UTR) of a GFP expression cassette encoded on a low-copy plasmid, such that ribozymes with high cleavage activities result in cells expressing low GFP levels
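The highlights above do not say how FACS-Seq read counts are converted into a per-sequence activity value. As a rough illustration only, one common approach in sort-and-sequence screens is a weighted mean of bin fluorescence, where each bin's weight is the number of cells estimated to carry the sequence. The function name, the four-bin layout, and all numbers below are hypothetical and are not taken from the paper.

```python
def estimate_log_gfp(read_counts, reads_per_bin, cells_per_bin, bin_log_gfp):
    """Weighted mean log-GFP of one sequence across sorting bins.

    Hypothetical sketch, not the authors' pipeline.
    read_counts   : reads of this sequence in each bin
    reads_per_bin : total reads sequenced from each bin
    cells_per_bin : number of cells sorted into each bin
    bin_log_gfp   : representative log10 GFP value of each bin
    """
    weights = []
    for reads, total_reads, cells in zip(read_counts, reads_per_bin, cells_per_bin):
        # Rescale reads to an estimated number of cells carrying this sequence.
        weights.append(reads / total_reads * cells)
    total = sum(weights)
    if total == 0:
        return None  # sequence was not observed in any bin
    return sum(w * g for w, g in zip(weights, bin_log_gfp)) / total


# Illustrative four-bin sort (all values invented for the example).
bin_log_gfp = [1.0, 2.0, 3.0, 4.0]
reads_per_bin = [1000, 1000, 1000, 1000]
cells_per_bin = [5000, 5000, 5000, 5000]

strong_cleaver = estimate_log_gfp([80, 20, 0, 0], reads_per_bin, cells_per_bin, bin_log_gfp)
weak_cleaver = estimate_log_gfp([0, 0, 20, 80], reads_per_bin, cells_per_bin, bin_log_gfp)
```

Under this sketch, a ribozyme that cleaves efficiently lands mostly in the low-GFP bins and receives a low estimate, which matches the relationship described above: high cleavage activity corresponds to low GFP expression.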
Summary
The genetic engineering of novel biological systems has the ability to produce solutions to a wide array of global challenges [1]. While other software tools exist for designing RNA switches, ours is the first that works with the ribozyme platform. This matters for two reasons: the platform has been demonstrated across different cell systems, and orthogonal systems of gene regulation allow larger constructs to be built without genetic overlap, avoiding recombination-based dropout. We used the resulting model to design a set of novel ribozyme switches, built from five different aptamers, that alter gene expression in yeast cells upon a change in cognate ligand concentration. [...] for which aptamer sequences are available, advancing the field of synthetic biology by enabling the computational design of dynamic biological systems.