Abstract

Bacterial small RNAs (sRNAs) are important post-transcriptional regulators of gene expression. The functional and evolutionary characterization of sRNAs requires the identification of homologs, which is frequently challenging due to their heterogeneity, short length and, in part, low sequence conservation. We developed the GLobal Automatic Small RNA Search go (GLASSgo) algorithm to identify sRNA homologs in complex genomic databases starting from a single sequence. GLASSgo combines an iterative BLAST strategy with pairwise identity filtering and a graph-based clustering method that utilizes RNA secondary structure information. We tested the specificity, sensitivity and runtime of GLASSgo, BLAST and the combination RNAlien/cmsearch in a typical use-case scenario on 40 bacterial sRNA families. The sensitivity of the tested methods was similar, while the specificity of GLASSgo and RNAlien/cmsearch was significantly higher than that of BLAST. GLASSgo was on average ∼87 times faster than RNAlien/cmsearch and only ∼7.5 times slower than BLAST, which shows that GLASSgo optimizes the trade-off between speed and accuracy in the task of finding sRNA homologs. GLASSgo is fully automated, whereas BLAST often recovers only parts of homologs and RNAlien/cmsearch requires extensive additional bioinformatic work to obtain a comprehensive set of homologs. GLASSgo is available as an easy-to-use web server to find homologous sRNAs in large databases.
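
The pairwise identity filtering mentioned above can be illustrated with a minimal sketch. This is not the GLASSgo implementation: it assumes that the query and each candidate hit are already aligned (gapped strings of equal length), it defines identity as matching columns divided by all columns that are not gaps in both sequences, and the function names and the `min_identity` threshold are hypothetical choices made only for illustration.

```python
from typing import List, Tuple


def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: columns that are gaps in both sequences are ignored;
    a gap opposite a nucleotide counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # skip columns that carry no information
        columns += 1
        if a == b:
            matches += 1
    return 100.0 * matches / columns if columns else 0.0


def filter_hits(
    hits: List[Tuple[str, str, str]], min_identity: float = 60.0
) -> List[str]:
    """Return the names of hits whose identity to the query is high enough.

    Each hit is (name, aligned_query, aligned_hit). The default threshold
    is a hypothetical value, not one taken from the paper.
    """
    return [
        name
        for name, aligned_query, aligned_hit in hits
        if pairwise_identity(aligned_query, aligned_hit) >= min_identity
    ]


if __name__ == "__main__":
    candidates = [
        ("hit_close", "ACGUACGU", "ACGUACGA"),
        ("hit_distant", "ACGUACGU", "UUGAUCAA"),
    ]
    print(filter_hits(candidates))
```

In a pipeline of the kind the abstract describes, a filter like this would sit between the search step and the structure-aware clustering step, so that only candidates within a chosen identity range are passed on.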

Highlights

  • Small regulatory RNAs are important regulators of gene expression in bacteria (Wagner and Romby, 2015)

  • We compared GLobal Automatic Small RNA Search go (GLASSgo) with two existing homolog prediction approaches, BLAST and the RNAlien/cmsearch combination

  • GLASSgo is available on a web server that can be used by non-experts to find homologous sequences

  • If desired, the sensitivity of GLASSgo can be enhanced without affecting the specificity by re-running the tool with a true positive (TP) that has a relatively low pairwise identity to the query

Summary

Introduction

Small regulatory RNAs (sRNAs) are important regulators of gene expression in bacteria (Wagner and Romby, 2015). Comparative computational tools for the prediction of sRNA targets (Wright et al, 2013, 2014), for the calculation of a potentially conserved secondary structure (Bernhart et al, 2008; Katoh and Toh, 2008; Smith et al, 2010) and for the prediction of small open reading frames of potential μ-proteins or dual-function sRNAs (Washietl et al, 2011) require multiple members of an sRNA family as input. In high-throughput dRNA-seq/RNA-seq experiments, the comparative approach can be used to predict potential sRNAs from scratch (Lott et al, 2017).

