Abstract

Riboswitches are cis-regulatory genetic elements that use an aptamer to control gene expression. Their specificity for cognate ligands, and the diversity of those ligands, have expanded the functional repertoire of riboswitches, allowing them to mount apt responses to sudden metabolic demands and to signal changes in environmental conditions. Despite their critical role in microbial life, riboswitch characterisation remains a challenging computational problem. Here we address this problem with deep learning frameworks, namely convolutional neural networks (CNN) and bidirectional recurrent neural networks (RNN) with Long Short-Term Memory (LSTM). Using a comprehensive dataset of 32 ligand classes and a stratified train-validate-test approach, we demonstrated the accurate performance of both deep learning models (CNN and RNN) relative to conventional hyperparameter-optimized machine learning classifiers on all key performance metrics, including receiver operating characteristic (ROC) curve analysis. In particular, the bidirectional LSTM RNN emerged as the best-performing method for identifying the ligand specificity of riboswitches, with an accuracy >0.99 and a macro-averaged F-score of 0.96. A further advantage is that the deep learning models do not require prior feature engineering. A dynamic update functionality is built into the models to account for the ongoing discovery of new riboswitches and to extend the predictive modelling to new classes. Our work could enable the design of genetic circuits with custom-tuned riboswitch aptamers that effect precise translational control in synthetic biology. The associated software is available as an open-source Python package and as a standalone resource for use in genome annotation, synthetic biology, and biotechnology workflows.
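
The abstract reports both an overall accuracy and a macro-averaged F-score. As a minimal sketch of how these two metrics can diverge on a multi-class problem with classes of unequal size, the following pure-Python example computes both on toy labels. The class names, label counts, and helper functions are hypothetical illustrations and are not taken from the paper or its software.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores over the classes in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never seen
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: one large class and one small class.
y_true = ["class_A"] * 98 + ["class_B"] * 2
# One of the two class_B examples is misclassified as class_A.
y_pred = ["class_A"] * 98 + ["class_A", "class_B"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F1 = {f1:.3f}")
```

Because the macro average weights every class equally, a single error in a small class lowers it far more than it lowers accuracy. This is one reason both metrics are worth reporting for a dataset with many ligand classes.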

Highlights

  • Riboswitches are ubiquitous and critical metabolite-sensing gene expression regulators in bacteria that are capable of folding into at least two alternative conformations of the 5′ UTR mRNA secondary structure, which functionally switch gene expression between on and off states (Mandal et al, 2003; Roth and Breaker, 2009; Serganov and Nudler, 2013)

  • Using a comprehensive dataset of 32 ligand classes and a stratified train-validate-test approach, we demonstrated the accurate performance of both deep learning models (CNN and recurrent neural network (RNN)) relative to conventional hyperparameter-optimized machine learning classifiers on all key performance metrics, including receiver operating characteristic (ROC) curve analysis

  • The best-performing among the base models was the Multi-layer Perceptron

Introduction

Riboswitches are ubiquitous and critical metabolite-sensing gene expression regulators in bacteria that are capable of folding into at least two alternative conformations of the 5′ UTR mRNA secondary structure, which functionally switch gene expression between on and off states (Mandal et al, 2003; Roth and Breaker, 2009; Serganov and Nudler, 2013). Riboswitches have been used directly as posttranscriptional and translational checkpoints in genetic circuits (Chang et al, 2012). Their key functional roles in infectious agents, together with their absence from host genomes, make them attractive targets for the design of cognate inhibitors (Blount and Breaker, 2006; Deigan and Ferré-D’Amaré, 2011; Wang et al, 2017). Characterisation of riboswitches would expand the repertoire of translational control options in synthetic biology and bioengineering, which would in turn facilitate the reliable construction of precise genetic circuits. In view of these applications, robust computational methods for the accurate characterisation of novel riboswitch sequences would be of great value.
