Abstract

RNA secondary structure prediction is widely used to understand RNA function. Existing dynamic programming-based algorithms, both the classical minimum free energy (MFE) methods and partition function methods, suffer from a major limitation: their runtimes scale cubically with the RNA length, and this slowness limits their use in genome-wide applications. Inspired by incremental parsing for context-free grammars in computational linguistics, we designed linear-time heuristic algorithms, LinearFold and LinearPartition, to approximate the MFE structure, partition function and base pairing probabilities. These programs are orders of magnitude faster than Vienna RNAfold and CONTRAfold on long sequences. More interestingly, LinearFold and LinearPartition lead to more accurate predictions on the longest sequence families for which the structures are well established (16S and 23S Ribosomal RNAs), as well as improved accuracies for long-range base pairs (500 + nucleotides apart). This chapter provides protocols for using LinearFold and LinearPartition for secondary structure prediction.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call