The RNA Newton polytope and learnability of energy parameters

Elmirasadat Forouzmand, Hamidreza Chitsaz

doi:10.1093/bioinformatics/btt226

Abstract

Motivation: Computational RNA structure prediction is a mature, important problem that has received a new wave of attention with the discovery of regulatory non-coding RNAs and the advent of high-throughput transcriptome sequencing. Despite nearly two score years of research on RNA secondary structure and RNA–RNA interaction prediction, the accuracy of the state-of-the-art algorithms is still far from satisfactory. So far, researchers have proposed increasingly complex energy models and improved parameter estimation methods, experimental and/or computational, in anticipation of endowing their methods with enough power to solve the problem. The outcome has, disappointingly, been only modest improvements that do not match expectations. Even recent massively featured machine learning approaches were not able to break the barrier. Why is that?

Approach: The first step toward high-accuracy structure prediction is to pick an energy model that is inherently capable of predicting each and every one of the structures known to date. In this article, we introduce the notion of learnability of the parameters of an energy model as a measure of such an inherent capability. We say that the parameters of an energy model are learnable iff there exists at least one set of such parameters that renders every RNA structure known to date the minimum free energy structure. We derive a necessary condition for learnability and give a dynamic programming algorithm to assess it. Our algorithm computes the convex hull of the feature vectors of all feasible structures in the ensemble of a given input sequence. Interestingly, that convex hull coincides with the Newton polytope of the partition function as a polynomial in energy parameters. To the best of our knowledge, this is the first approach toward computing the RNA Newton polytope and a systematic assessment of the inherent capabilities of an energy model. The worst-case complexity of our algorithm is exponential in the number of features. However, dimensionality reduction techniques can provide approximate solutions to avoid the curse of dimensionality.

Results: We demonstrated the application of our theory to a simple energy model consisting of a weighted count of A-U, C-G and G-U base pairs. Our results show that this simple energy model satisfies the necessary condition for more than half of the input unpseudoknotted sequence–structure pairs (55%) chosen from the RNA STRAND v2.0 database and severely violates the condition for ∼13%, which provides a set of hard cases that require further investigation. Of 1350 RNA strands, the observed 3D feature vector for 749 strands is on the surface of the computed polytope. For 289 RNA strands, the observed feature vector is not on the boundary of the polytope, but its distance from the boundary is not more than one. A distance of one essentially means a one-base-pair difference between the observed structure and the closest point on the boundary of the polytope, which need not be the feature vector of a structure. For 171 sequences, this distance is larger than two, and for only 11 sequences, this distance is larger than five.

Availability: The source code is available at http://compbio.cs.wayne.edu/software/rna-newton-polytope.

Contact: chitsaz@wayne.edu
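The necessary condition described under Approach can be illustrated on the three-feature model from Results. The following is a minimal Python sketch, not the authors' implementation. It assumes a Nussinov-style recursion over unpseudoknotted structures and a minimum hairpin loop of three unpaired bases (`MIN_LOOP`, an assumption). It stores the full set of feature vectors rather than their convex hull, so it is suitable only for short sequences. The names `feature_vectors` and `supporting_parameters` are hypothetical. The boundary check is a coarse grid search over nonzero parameter vectors standing in for an exact convex hull computation: a returned parameter vector certifies that the observed feature vector attains the minimum energy (ties included), whereas `None` only means that no certificate was found on the grid.

```python
from itertools import product

# Simple energy model from the abstract: one feature per base-pair type.
PAIR_FEATURE = {
    frozenset("AU"): (1, 0, 0),  # A-U
    frozenset("CG"): (0, 1, 0),  # C-G
    frozenset("GU"): (0, 0, 1),  # G-U
}
MIN_LOOP = 3  # assumption: at least 3 unpaired bases enclosed by any pair


def add(u, v):
    """Component-wise sum of two feature vectors."""
    return tuple(a + b for a, b in zip(u, v))


def feature_vectors(seq):
    """Set of (A-U, C-G, G-U) pair counts over all unpseudoknotted structures."""
    n = len(seq)
    zero = (0, 0, 0)
    # table[i][j]: vectors achievable on the half-open interval seq[i:j].
    table = [[{zero} for _ in range(n + 1)] for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            vecs = set(table[i + 1][j])  # case 1: base i is unpaired
            for k in range(i + MIN_LOOP + 1, j):  # case 2: base i pairs with k
                feat = PAIR_FEATURE.get(frozenset((seq[i], seq[k])))
                if feat is None:
                    continue  # not an allowed pair type
                # Minkowski sum of the enclosed part and the part after k.
                for inner in table[i + 1][k]:
                    for outer in table[k + 1][j]:
                        vecs.add(add(feat, add(inner, outer)))
            table[i][j] = vecs
    return table[0][n]


def supporting_parameters(observed, vectors, grid=2):
    """Search integer parameters theta != 0 with theta . observed minimal.

    Returns the first such theta found, or None if the grid holds none.
    """
    for theta in product(range(-grid, grid + 1), repeat=3):
        if not any(theta):
            continue  # theta = 0 makes every structure optimal; skip it
        energy = lambda c: sum(t * x for t, x in zip(theta, c))
        if energy(observed) <= min(energy(v) for v in vectors):
            return theta
    return None
```

For example, `feature_vectors("GGGAAACCC")` yields the four vectors (0, k, 0) for k = 0, 1, 2, 3, because only C-G pairs can form in that sequence. Replacing the stored sets with convex hulls of the same unions and Minkowski sums would be the natural next step toward the polytope computation the abstract describes.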

Journal: Bioinformatics	Publication Date: Jun 19, 2013
Citations: 43	License type: CC BY-NC 3.0
