Abstract

Background: Structural alignment of RNAs has become increasingly important since the discovery of functional non-coding RNAs (ncRNAs). Recent studies, mainly based on various approximations of the Sankoff algorithm, have considerably improved the accuracy of pairwise structural alignment. In contrast, for cases with more than two sequences, the practical merit of structural alignment over traditional sequence-based methods remains unclear, although the importance of multiple structural alignment is widely recognized.

Results: With accuracy and time complexity in mind, we took an approach different from a straightforward extension of the Sankoff algorithm to multiple alignment. As a new option of the MAFFT alignment program, we developed a multiple RNA alignment framework, X-INS-i, which builds a multiple alignment with an iterative method incorporating structural information through two components: (1) pairwise structural alignments by an external pairwise alignment method such as SCARNA or LaRA and (2) a new objective function, Four-way Consistency, derived from the base-pairing probability of every sub-aligned group at every multiple alignment stage.

Conclusion: The BRAliBASE benchmark showed that X-INS-i outperforms other currently available methods under the sum-of-pairs score (SPS) criterion. As a basis for predicting common secondary structure, the accuracy of the present method is comparable to or somewhat higher than that of current leading methods such as RNA Sampler. The X-INS-i framework can be used for building a multiple RNA alignment from any combination of algorithms for pairwise RNA alignment and base-pairing probability. The source code is available at the webpage listed in the Availability and requirements section.
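To make the SPS criterion concrete, the sketch below computes a sum-of-pairs score in the way it is commonly defined for alignment benchmarks: the fraction of residue pairs aligned in a reference alignment that are also aligned in the test alignment. The function names `aligned_pairs` and `sum_of_pairs_score` are illustrative, and the scoring tool actually used with BRAliBASE may differ in details such as gap handling, so this should be read as an assumption-laden illustration rather than the benchmark's implementation.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each residue is identified as (sequence index, ungapped position).
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows must have the same length")

    next_residue = [0] * len(alignment)  # ungapped position counter per sequence
    pairs = set()
    for col in range(n_cols):
        present = []
        for seq_index, row in enumerate(alignment):
            if row[col] != "-":
                present.append((seq_index, next_residue[seq_index]))
                next_residue[seq_index] += 1
        # Every two residues in the same column form one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs that the test alignment reproduces."""
    reference_pairs = aligned_pairs(reference)
    if not reference_pairs:
        raise ValueError("reference alignment contains no aligned pairs")
    return len(aligned_pairs(test) & reference_pairs) / len(reference_pairs)


# Toy example (hypothetical sequences, not taken from BRAliBASE).
reference = ["AC-GU", "ACAGU"]
test = ["ACG-U", "ACAGU"]
print(sum_of_pairs_score(test, reference))  # 0.75
```

In the toy example the test alignment misplaces one G, so three of the four reference pairs are recovered.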

Highlights

  • Structural alignment of RNAs has become increasingly important since the discovery of functional non-coding RNAs

  • To clarify the effect of each of the two components, we evaluated the accuracies of X-INS-i variants with and without the two components

  • The current version of X-INS-i-scarnapair is faster than RNA Sampler and Murlet, but slower than MXSCARNA

Summary

Introduction

Structural alignment of RNAs has become increasingly important since the discovery of functional non-coding RNAs (ncRNAs). Multiple alignment is an important step in various phases of comparative studies of RNAs, such as the detection of common secondary structures from a set of homologous sequences and the preparation of an alignment as a query for database search tools including Infernal [1]. The Sankoff algorithm, which simultaneously performs sequence alignment and secondary structure prediction, implements the simultaneous strategy referred to here as plan B. However, this algorithm is not applicable to real analyses with more than two sequences because of its time complexity, O(L^{3N}), where L is the sequence length and N is the number of sequences. Even if a virtually optimal pairwise structural alignment were obtained by using variants of the Sankoff algorithm or other algorithms, handling multiple sequences would remain a nontrivial task. For predicting common secondary structure in multiple unaligned sequences, some sort of heuristic is therefore inevitably required, since an exact application of plan B is impossible.
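A rough calculation shows why the Sankoff cost, which grows as L raised to the power 3N, rules out more than two sequences. The sketch below ignores constant factors, and the sequence length of 100 is an arbitrary illustrative choice, not a figure from the paper; the helper name `sankoff_log10_cost` is likewise hypothetical.

```python
import math


def sankoff_log10_cost(length, n_sequences):
    """Return log10 of L^(3N), the growth term quoted in the text.

    Constant factors are ignored, so this gives only an order of magnitude.
    """
    return 3 * n_sequences * math.log10(length)


# Illustrative sequence length of 100 nucleotides (an assumption).
for n_sequences in (2, 3, 5):
    exponent = sankoff_log10_cost(100, n_sequences)
    print(f"N={n_sequences}: about 10^{exponent:.0f} elementary steps")
```

For L = 100, moving from two to three sequences raises the estimate from about 10^12 to about 10^18 steps, which is why heuristics are needed for multiple alignment.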
