A probabilistic model for the evolution of RNA structure

Ian Holmes

doi:10.1186/1471-2105-5-166

Abstract

Background: For the purposes of finding and aligning noncoding RNA gene- and cis-regulatory elements in multiple-genome datasets, it is useful to be able to derive multi-sequence stochastic grammars (and hence multiple alignment algorithms) systematically, starting from hypotheses about the various kinds of random mutation event and their rates.

Results: Here, we consider a highly simplified evolutionary model for RNA, called "The TKF91 Structure Tree" (following Thorne, Kishino and Felsenstein's 1991 model of sequence evolution with indels), which we have implemented for pairwise alignment as proof of principle for such an approach. The model, its strengths and its weaknesses are discussed with reference to four examples of functional ncRNA sequences: a riboswitch (guanine), a zipcode (nanos), a splicing factor (U4) and a ribozyme (RNase P). As shown by our visualisations of posterior probability matrices, the selected examples illustrate three different signatures of natural selection that are highly characteristic of ncRNA: (i) co-ordinated basepair substitutions, (ii) co-ordinated basepair indels and (iii) whole-stem indels.

Conclusions: Although all three types of mutation "event" are built into our model, events of type (i) and (ii) are found to be better modeled than events of type (iii). Nevertheless, we hypothesise from the model's performance on pairwise alignments that it would form an adequate basis for a prototype multiple alignment and genefinding tool.

Highlights

For the purposes of finding and aligning noncoding RNA gene- and cis-regulatory elements in multiple-genome datasets, it is useful to be able to derive multi-sequence stochastic grammars systematically, starting from hypotheses about the various kinds of random mutation event and their rates.
The three types of element considered by QRNA are noncoding RNA, protein-coding exons, and unidentified DNA homology.
The pairwise aligner for the TKF91 Structure Tree is distributed as part of the DART package at the following URL: http://www.biowiki.org/

Summary

Introduction

For the purposes of finding and aligning noncoding RNA gene- and cis-regulatory elements in multiple-genome datasets, it is useful to be able to derive multi-sequence stochastic grammars (and multiple alignment algorithms) systematically, starting from hypotheses about the various kinds of random mutation event and their rates. A principled way to detect such elements is by fitting the data to probabilistic models of the molecular evolutionary process. Suppose there are several alternative scenarios X, Y, Z, ... (e.g. exons, bits of RNA, promoters, etc.) that might explain an observed sequence homology. For each of these scenarios, we can construct a probabilistic model Mx, My, Mz, ... The model with the best fit indicates the type of functional element present in the sequence. A groundbreaking example of how this probabilistic approach can be used is the QRNA program, designed as a comparative RNA gene predictor [1]. The three types of element considered by QRNA are noncoding RNA (called RNA), protein-coding exons (called COD for codon), and unidentified DNA homology (called OTH for other).
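The model-comparison idea in this paragraph can be illustrated with a minimal sketch. It assumes that a log-likelihood of the aligned sequences has already been computed under each candidate model, and it converts those values into posterior probabilities over models. The function names, the uniform default prior and the numeric log-likelihoods are all hypothetical, chosen only for illustration; this is not QRNA's actual scoring scheme, only the labels RNA, COD and OTH come from the text.

```python
import math


def model_posteriors(log_likelihoods, log_priors=None):
    """Return P(model | data) for each model, given log P(data | model).

    Assumption: if no priors are supplied, every model is equally likely
    a priori. This is an illustrative choice, not taken from the paper.
    """
    names = list(log_likelihoods)
    if log_priors is None:
        log_priors = {name: -math.log(len(names)) for name in names}

    # Log joint probability: log P(data | model) + log P(model).
    joint = {name: log_likelihoods[name] + log_priors[name] for name in names}

    # Log-sum-exp: subtract the maximum before exponentiating so that
    # very negative log-likelihoods do not underflow to zero.
    peak = max(joint.values())
    log_evidence = peak + math.log(sum(math.exp(v - peak) for v in joint.values()))

    return {name: math.exp(v - log_evidence) for name, v in joint.items()}


def best_model(log_likelihoods, log_priors=None):
    """Return the name of the model with the highest posterior probability."""
    posteriors = model_posteriors(log_likelihoods, log_priors)
    return max(posteriors, key=posteriors.get)


# Hypothetical log-likelihoods of one pairwise alignment under three models.
example = {"RNA": -1042.7, "COD": -1051.3, "OTH": -1049.9}
posteriors = model_posteriors(example)
label = best_model(example)
```

With these made-up numbers the RNA model fits best, so `label` is `"RNA"`. Working in log space is a common design choice here, because likelihoods of whole alignments are far too small to represent directly as floating-point probabilities.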



Journal: BMC Bioinformatics
Publication Date: Jan 1, 2004
Citations: 83
License type: CC BY 2.0


