RNAlign program: alignment of RNA sequences using both primary and secondary structures.

Florence Corpet, Bernard Michot

doi:10.1093/bioinformatics/10.4.389

Abstract

We have developed an algorithm and a computer program for aligning new RNA sequences with a bank of aligned homologous RNA sequences. Given a common folding structure for the bank, the program performs an alignment between the bank and a new sequence that is optimal in terms of both primary and secondary structure. This method is useful for aligning sequences that share a common folding structure despite extensive divergence of their primary structures. It allows these preserved regions to be precisely distinguished from domains with more variable secondary structure. An optimal alignment of a sequence of length N with a bank of homologous sequences of length M is produced in O(M²N³) time and O(M²N²) space. For sequences that are too long for an algorithm of this complexity, a proposed strategy is to compute a classical alignment (using only primary structure data) and then improve it with the new algorithm in the regions where the bank stems are not aligned with possible stems in the new sequence. The algorithm has been implemented in Turbo Pascal on a PC, and has been used to align RNA sequences of the eubacterial large ribosomal subunit.
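
The strategy for long sequences depends on finding the regions where bank stems are not aligned with possible stems in the new sequence. The Python sketch below illustrates one way such a check could look; it is not the RNAlign implementation, which was written in Turbo Pascal. The function name `unsupported_pairs`, the representation of the bank structure as pairs of alignment columns, and the choice to accept G-U wobble pairs alongside Watson-Crick pairs are all assumptions made for this illustration.

```python
# Pairs allowed in a stem: Watson-Crick plus the G-U wobble
# (an assumption of this sketch, not taken from the abstract).
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def unsupported_pairs(aligned_seq, bank_pairs):
    """Return the bank base pairs (column i, column j) that the aligned new
    sequence cannot form: a gap at either column, or two non-pairing bases."""
    seq = aligned_seq.upper().replace("T", "U")
    missing = []
    for i, j in bank_pairs:
        if (seq[i], seq[j]) not in CAN_PAIR:
            missing.append((i, j))
    return missing


# Hypothetical example: the bank structure has one 3-pair stem
# joining alignment columns 0-2 with columns 8-6.
bank_pairs = [(0, 8), (1, 7), (2, 6)]
print(unsupported_pairs("GGCAAAGCC", bank_pairs))  # every bank pair can form
print(unsupported_pairs("GG-AAAGCC", bank_pairs))  # pair (2, 6) falls on a gap
```

Columns reported by such a check would mark the regions worth realigning with the slower structure-aware algorithm, while the rest of the classical alignment is kept as it is.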
