Prediction of RNA structures containing pseudoknots

Dongkyu Lee, Kyungsook Han

doi:10.4051/ibce.2009.1.0005

Abstract

This paper describes a genetic algorithm for predicting RNA structures that contain various types of pseudoknots. Pseudoknotted RNA structures are much more difficult to predict computationally than RNA secondary structures, because they are more complex and their analysis is time-consuming. We developed an efficient genetic algorithm to predict RNA folding structures containing any type of pseudoknot, together with a novel method for generating the initial population that decreases computational complexity and increases the accuracy of the results. We also used an interaction filter to decrease the size of the possible stem lists for long RNA sequences. We predicted RNA structures using a number of different termination conditions and compared the validity of the results and the times required for the analyses. The algorithm proved able to efficiently predict RNA structures containing various types of pseudoknots.

Corresponding Author: Kyungsook Han (Email: khan@inha.ac.kr)

This work was supported by the Korea Science and Engineering Foundation (KOSEF) under grant R01-2003000-10461-0.

Introduction

Predicting an RNA structure that contains a pseudoknot is computationally demanding. Predicting the most stable structure, that is, the one with minimal free energy, from an RNA sequence is an optimization problem (Lee and Han, 2002; Lee and Han, 2003; Deiman and Pleij, 1997). Computational methods for predicting RNA structure generally make use of two algorithms, one combinatorial and the other recursive.

The combinatorial algorithm first creates an inventory of all possible stems that can be formed by a given RNA sequence, and then determines the combination of stems with the lowest free energy. This algorithm has the advantage that it can include pseudoknot structures, but the number of possible structures increases immensely with sequence length (Rivas and Eddy, 1999; Akutsu, 2000).

The recursive algorithm finds the lowest free energy structure from the sub-fragments of a sequence.
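The combinatorial approach described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `enumerate_stems` and `best_combination`, the minimum stem length and loop size, and the toy energy of -1 per base pair are all assumptions made for illustration, and only maximal stems are listed. Real predictors use measured thermodynamic parameters instead of a pair count.

```python
from itertools import combinations

# Canonical Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def enumerate_stems(seq, min_len=3, min_loop=3):
    """List maximal stems as (i, j, length): base i+k pairs with base j-k for k < length."""
    stems = []
    n = len(seq)
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            # Skip pairs that only continue a stem starting one step further out.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inward while bases pair and at least min_loop bases stay unpaired.
            while ((j - length) - (i + length) > min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems


def bases(stem):
    """Return the set of sequence positions used by a stem."""
    i, j, length = stem
    return set(range(i, i + length)) | set(range(j - length + 1, j + 1))


def best_combination(stems):
    """Brute-force the stem subset with the lowest toy energy (-1 per base pair).

    Stems may cross each other, so pseudoknots are allowed; they only
    must not reuse a base. The loop visits every subset, which is why
    this approach becomes infeasible as the stem list grows.
    """
    best, best_energy = (), 0
    for r in range(1, len(stems) + 1):
        for combo in combinations(stems, r):
            used = set()
            valid = True
            for stem in combo:
                b = bases(stem)
                if used & b:
                    valid = False
                    break
                used |= b
            if valid:
                energy = -sum(stem[2] for stem in combo)
                if energy < best_energy:
                    best, best_energy = combo, energy
    return best, best_energy


stems = enumerate_stems("GGGAAACCC")
structure, energy = best_combination(stems)
```

Because `best_combination` examines every subset of the stem list, its running time doubles with each additional stem, which mirrors the growth in the number of candidate structures noted in the text.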
The recursion systematically searches all sub-fragments for the lowest free energy structure containing at least one base pair. The first sub-fragments considered are those capable of forming a hairpin loop closed by a single base pair, so the first pass finds the lowest free energy structures for all pentanucleotides in the sequence. This method always finds the structure with the lowest free energy, but it does not identify structures such as pseudoknots because of their computational complexity.

A genetic algorithm (GA) is an optimization procedure that mimics the mechanism of biological evolution. It begins with a set of candidate solutions called a population. Solutions are taken from the current population and used to form a new population, in the hope that the new population will be superior to the old one. They are selected to generate new solutions according to their fitness: the fitter they are, the more opportunities they have to reproduce. This procedure is repeated until a specified termination condition is satisfied.

Genetic algorithms have been shown, both theoretically and empirically, to provide robust searches in highly complex and uncertain spaces, and they are finding widespread application in commerce, science and engineering. They are computationally simple yet powerful search methods, and many researchers have applied them to RNA structure prediction and sequence alignment. They have been used to seek optimal and sub-optimal RNA secondary structures (Benedetti and Morosetti, 1995; Shapiro and Navetta, 1994) and to simulate RNA folding pathways (Gultyaev et al., 1995; Shapiro et al., 2001). Massively parallel genetic algorithms have been employed to predict RNA structures that include pseudoknots (Shapiro and Wu, 1996; Shapiro and Wu, 1997). However, the predicted structures contained only H (Hairpin)-type pseudoknots, and the computations were extremely complex because they used randomly generated initial populations.
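The generic GA loop described above (population, fitness-based selection, reproduction, termination condition) can be sketched in Python as follows. This is a hedged illustration, not the algorithm of this paper: the names `overlaps`, `repair`, `fitness` and `evolve` are hypothetical, each individual is assumed to be a bit list over a precomputed stem list with stems given as `(i, j, length)`, the fitness is a toy base-pair count instead of a free energy, and the initial population is random, which is the strategy the text attributes to earlier work.

```python
import random


def overlaps(a, b):
    """True if stems a and b, each (i, j, length), share at least one base."""
    def bases(stem):
        i, j, n = stem
        return set(range(i, i + n)) | set(range(j - n + 1, j + 1))
    return bool(bases(a) & bases(b))


def repair(genome, stems):
    """Clear any selected stem that clashes with an earlier selected stem."""
    kept = []
    out = [0] * len(genome)
    for k, bit in enumerate(genome):
        if bit and not any(overlaps(stems[k], stems[m]) for m in kept):
            kept.append(k)
            out[k] = 1
    return out


def fitness(genome, stems):
    """Toy fitness (assumption): number of base pairs in the selected stems."""
    return sum(stems[k][2] for k, bit in enumerate(genome) if bit)


def evolve(stems, pop_size=20, generations=50, mutation_rate=0.05, seed=0):
    """Run a simple GA and return the fittest genome of the final population."""
    rng = random.Random(seed)
    n = len(stems)
    # Random initial population (not the initial population method of the paper).
    pop = [repair([rng.randint(0, 1) for _ in range(n)], stems)
           for _ in range(pop_size)]

    def pick():
        # Tournament selection: the fitter of two random individuals reproduces.
        a, b = rng.choice(pop), rng.choice(pop)
        return a if fitness(a, stems) >= fitness(b, stems) else b

    for _ in range(generations):  # termination condition: fixed generation count
        # Elitism: carry the current best individual into the next generation.
        nxt = [max(pop, key=lambda g: fitness(g, stems))]
        while len(nxt) < pop_size:
            p, q = pick(), pick()
            cut = rng.randint(0, n)  # one-point crossover
            child = p[:cut] + q[cut:]
            # Bit-flip mutation adds or removes individual stems.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            nxt.append(repair(child, stems))
        pop = nxt
    return max(pop, key=lambda g: fitness(g, stems))


# Three mutually crossing stems, so any structure with two or more is pseudoknotted.
example_stems = [(0, 20, 4), (5, 30, 3), (10, 40, 3)]
best = evolve(example_stems)
```

The fixed generation count is only one possible termination condition; stopping when the best fitness stops improving would be an equally simple alternative in this sketch.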
Dynamic programming algorithms have also been used to predict RNA structures including pseudoknots (Rivas and Eddy, 1999), but again they could only predict structures with H-type pseudoknots, and only from short RNA sequences. We have developed a GA that is able to predict efficiently

More From: Interdisciplinary Bio Central
