Abstract

While bacterial operons have been thoroughly studied, few analyses of chloroplast operons exist, limiting the ability to study fundamental elements of these structures and to utilize them for synthetic biology. Here, we describe the creation of a plastome-specific operon database (link provided below), achieved by combining experimental tools and predictive modeling. Using a reverse-transcription PCR-based method and published data, we determined the transcription state of 213 gene pairs from four plastomes of evolutionarily distinct organisms. By analyzing sequence-based features computed for our dataset, we were able to highlight fundamental characteristics that differentiate operon pairs from non-operon pairs. These include an interesting tendency toward maintaining similar messenger RNA folding profiles in operon gene pairs, a feature that failed to yield any informative separation in cyanobacteria, suggesting that it captures unique traits of operon gene expression that have evolved post-endosymbiosis. Subsequently, we used this feature set to train a random-forest classifier for operon prediction. As our results demonstrate the ability of our predictor to obtain accurate (84%) and robust predictions on unlabeled datasets, we proceeded to build operon maps for 2018 sequenced plastids. Our database may now present new opportunities for promoting metabolic engineering and synthetic biology in chloroplasts.

Highlights

  • Plastids are cellular organelles mainly found in a diverse group of photosynthetic organisms [1]

  • Specific primers were designed for each chosen gene pair; the forward primer annealed to the 5′ gene, whereas the reverse primer annealed to the 3′ gene (Figure 1B-1)

  • To create a generalist dataset of plastid operons, we began by obtaining empirical operon data from four chloroplast genomes: H. vulgare, C. reinhardtii, C. merolae and P. tricornutum (Heterokont) (Figure 1A) [61]

Introduction

Plastids are cellular organelles mainly found in a diverse group of photosynthetic organisms [1]. Several recent studies have attempted to predict bacterial operons by utilizing supervised machine-learning algorithms trained on experimental data [11,12,13,14,15,16]. These computational methods typically rely on features such as intergenic distances between adjacent genes [16], conservation of gene order [17,18], functional classifications [19,20] and differential RNA levels [12,13]. Bacterial operons are relatively well defined [21,22,23,24] and can be found in several online databases [15,25,26,27,28].
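
As an illustration of the simplest of these features, the intergenic distance between adjacent genes, the following is a minimal sketch in Python. It is not the authors' pipeline: the `Gene` record, the `adjacent_pair_distances` helper, and the gene names and coordinates are all hypothetical, and the sketch assumes a linear coordinate system (it ignores the wrap-around of a circular genome) and skips neighbours on opposite strands as a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def adjacent_pair_distances(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (left gene, right gene, intergenic distance) for each pair of
    coordinate-adjacent genes that lie on the same strand."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand != right.strand:
            # Simplifying assumption: only same-strand neighbours are
            # treated as candidate operon pairs.
            continue
        # Number of bases strictly between the two genes;
        # the value is negative when the annotations overlap.
        distance = right.start - left.end - 1
        pairs.append((left.name, right.name, distance))
    return pairs


# Made-up coordinates, for illustration only.
example = [
    Gene("geneA", 100, 400, "+"),
    Gene("geneB", 450, 900, "+"),
    Gene("geneC", 1200, 1500, "-"),
    Gene("geneD", 1490, 2000, "-"),
]
pairs = adjacent_pair_distances(example)
```

Each resulting tuple could serve as one row of a feature table for a supervised classifier, alongside the other feature types listed above.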
