Small molecules that target specific RNA binding sites, including stable and transient RNA structures, are emerging as effective pharmacological approaches for modulating gene expression. However, little is understood about how stable RNA secondary structures are shared across organisms, an important factor in controlling drug selectivity. In this study, I provide an analytical pipeline named RNA Secondary Structure Finder (R2S-Finder) to discover short, stable RNA structural motifs in humans, Escherichia coli (E. coli), SARS-CoV-2, and Zika virus by leveraging existing in vivo and in vitro genome-wide chemical RNA-probing datasets. I found several common features across organisms. For example, in addition to the well-documented tetraloops, AU-rich tetraloops are widely present in different organisms. I also found that the 5' untranslated region (UTR) contains a higher proportion of stable structures than the coding sequences in humans, SARS-CoV-2, and Zika virus. In general, stable structures predicted from in vitro (protein-free) and in vivo datasets are consistent in humans, E. coli, and SARS-CoV-2, indicating that the formation of most stable structures was driven by RNA folding alone, whereas a larger variation between in vitro and in vivo data was found for certain RNA types, such as human long intergenic non-coding RNAs (lincRNAs). Finally, I predicted stable three- and four-way RNA junctions that exist under both in vivo and in vitro conditions and can potentially serve as drug targets. All results for stable sequences, stem-loops, internal loops, bulges, and three- and four-way junctions have been collated in the R2S-Finder database (https://github.com/JingxinWangLab/R2S-Finder), which is coded as hyperlinked HTML pages and tabulated in CSV files.