Abstract
In addition to encoding RNA primary structures, genomes also encode RNA secondary and tertiary structures that play roles in gene regulation and, in the case of RNA viruses, genome replication. Methods for the identification of functional RNA structures in genomes typically rely on scanning analysis windows, where multiple partially-overlapping windows are used to predict RNA structures and folding metrics to deduce regions likely to form functional structure. Separate structural models are produced for each window, where the step size can greatly affect the returned model. This makes deducing unique local structures challenging, as the same nucleotides in each window can be alternatively base paired. We are presenting here a new approach where all base pairs from analysis windows are considered and weighted by favorable folding. This results in unique base pairing throughout the genome and the generation of local regions/structures that can be ranked by their propensity to form unusually thermodynamically stable folds. We applied this approach to the Zika virus (ZIKV) and HIV-1 genomes. ZIKV is linked to a variety of neurological ailments including microcephaly and Guillain–Barré syndrome and its (+)-sense RNA genome encodes two, previously described, functionally essential structured RNA regions. HIV, the cause of AIDS, contains multiple functional RNA motifs in its genome, which have been extensively studied. Our approach is able to successfully identify and model the structures of known functional motifs in both viruses, while also finding additional regions likely to form functional structures. All data have been archived at the RNAStructuromeDB (www.structurome.bb.iastate.edu), a repository of RNA folding data for humans and their pathogens.
Highlights
In coordination with experimental techniques to determine genome-scale RNA secondary structure, computational methods are indispensable for identifying functional RNA structures
These algorithms function under the same principle: using the Turner nearest-neighbor energy parameters to predict the free energy (DG) yielded during the formation of the most stable, or minimum free energy (MFE) RNA secondary structure, which is assumes that the MFE structure is, or at least closely resembles, the native secondary structure
ScanFold-Fold predicted motifs in the Zika virus (ZIKV) genome The ZIKV genome was analyzed with ScanFold-Scan using a 120 nt window with a one nt step: resulting in 10,688 analyzed windows (Table S1)
Summary
In coordination with (or in the absence of) experimental techniques to determine genome-scale RNA secondary structure, computational methods are indispensable for identifying functional RNA structures. These algorithms (such as those found in programs like RNAfold; Lorenz et al, 2011, RNAstructure; Reuter & Mathews, 2010, and UNAfold; Markham & Zuker, 2008) function under the same principle: using the Turner nearest-neighbor energy parameters (empirically derived thermodynamic parameters; Mathews et al, 1999, 2004) to predict the free energy (DG) yielded during the formation of the most stable, or minimum free energy (MFE) RNA secondary structure, which is assumes that the MFE structure is, or at least closely resembles, the native secondary structure. Resulting MFE structure predictions have been shown to correctly predict ∼70% of base pairs in sequences
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.