Abstract

Background: Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA (ncRNA) families. An accurate reconstruction of ancestral ncRNAs must use this structural signal. However, the inference of ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors.

Methods: In this paper, we introduce achARNement, a maximum parsimony approach that, given two alignments of homologous ncRNA families with consensus secondary structures and a phylogenetic tree, simultaneously calculates ancestral RNA sequences for these two families.

Results: We test our methodology on simulated data sets, and show that achARNement not only outperforms classical maximum parsimony approaches in terms of accuracy, but also reduces the number of candidate sequences by several orders of magnitude. To conclude this study, we apply our algorithms to the Glm clan and the FinP-traJ clan from the Rfam database.

Conclusions: Our results show that our methods reconstruct small sets of high-quality candidate ancestors that agree better with the two target structures than those obtained with classical approaches. Our program is freely available at: http://csb.cs.mcgill.ca/acharnement.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3105-4) contains supplementary material, which is available to authorized users.

Highlights

  • Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA families

  • Input data: For the algorithms presented in this paper, we assume that we have two non-coding RNA families that have been identified as a clan [26].

  • Algorithms: We propose a new tool, achARNement, composed of two exact algorithms (CalculateScores-1struct and CalculateScores-2structs) based on the Fitch [21] and Sankoff [22] parsimony methods for the inference of ancestral sequences in a phylogeny.
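
For readers unfamiliar with the Fitch method named in the last highlight, the bottom-up pass of classical Fitch parsimony on a single alignment column can be sketched as follows. This is only an illustration of the textbook method that the highlight cites as a basis; it is not achARNement's CalculateScores-1struct or CalculateScores-2structs, and it ignores secondary structure entirely. The function name `fitch_sets`, the nested-tuple tree encoding, and the example data are assumptions made for this sketch.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical tree encoding for this sketch: a leaf is its name (str),
# an internal node is a (left, right) tuple of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_sets(tree: Tree, leaf_states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Bottom-up Fitch pass for one alignment column.

    Returns the candidate state set at the root of `tree` and the
    minimum number of substitutions needed in the subtree.
    """
    if isinstance(tree, str):
        # A leaf contributes its observed nucleotide at zero cost.
        return {leaf_states[tree]}, 0

    left, right = tree
    left_set, left_cost = fitch_sets(left, leaf_states)
    right_set, right_cost = fitch_sets(right, leaf_states)

    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra substitution.
        return common, left_cost + right_cost
    # The children disagree: keep the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Made-up example: four species and one alignment column.
example_tree: Tree = (("a", "b"), ("c", "d"))
example_column = {"a": "A", "b": "G", "c": "A", "d": "A"}
root_states, substitutions = fitch_sets(example_tree, example_column)
```

In this toy case the root keeps a single candidate state with one substitution. On full sequences, positions with several equally parsimonious states multiply into many candidate ancestors, which is the candidate-set size the abstract reports reducing.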


Summary

Introduction

Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA (ncRNA) families. The inference of ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors. Most attention has been given to the reconstruction of ancient protein and DNA sequences, while RNA molecules have remained relatively overlooked. The reconstruction of ncRNA sequences is challenging: ncRNA functions are typically carried out by specific molecular structures, and sequences are generally less conserved than structures [12]. This implies that dedicated frameworks must be developed to capture this structural information.

