Computational Recognition of RNA Splice Sites by Exact Algorithms for the Quadratic Traveling Salesman Problem

Anja Fischer, Frank Fischer, Ivo Grosse, Jens Keilwagen, Paul Molitor, Gerold Jäger

doi:10.3390/computation3020285

Abstract

One fundamental problem of bioinformatics is the computational recognition of DNA and RNA binding sites. Given a set of short DNA or RNA sequences of equal length such as transcription factor binding sites or RNA splice sites, the task is to learn a pattern from this set that allows the recognition of similar sites in another set of DNA or RNA sequences. Permuted Markov (PM) models and permuted variable length Markov (PVLM) models are two powerful models for this task, but the problem of finding an optimal PM model or PVLM model is NP-hard. While the problem of finding an optimal PM model or PVLM model of order one is equivalent to the traveling salesman problem (TSP), the problem of finding an optimal PM model or PVLM model of order two is equivalent to the quadratic TSP (QTSP). Several exact algorithms exist for solving the QTSP, but it is unclear if these algorithms are capable of solving QTSP instances resulting from RNA splice sites of at least 150 base pairs in a reasonable time frame. Here, we investigate the performance of three exact algorithms for solving the QTSP for ten datasets of splice acceptor sites and splice donor sites of five different species and find that one of these algorithms is capable of solving QTSP instances of up to 200 base pairs with a running time of less than two days.

Highlights

Gene regulation in higher organisms is accomplished at several levels, such as transcriptional and post-transcriptional regulation, by several cellular processes, such as transcription initiation and RNA splicing.
The computational recognition of RNA splice sites is an important task in bioinformatics, and two popular models for this task are permuted Markov models and permuted variable length Markov models.
Learning permuted Markov models and permuted variable length Markov models is NP-hard and a challenging problem for the recognition of RNA splice sites, because it has been shown that sequences of at least 150 bp surrounding the splice sites should be taken into account for reliable recognition.

Summary

Introduction

Gene regulation in higher organisms is accomplished at several levels, such as transcriptional and post-transcriptional regulation, by several cellular processes, such as transcription initiation and RNA splicing. Many approaches for the computational recognition of transcription factor binding sites or RNA splice sites rely on statistical models, and two popular models for this task are permuted Markov (PM) models and permuted variable length Markov (PVLM) models. It would be desirable to develop exact algorithms capable of learning PM models and PVLM models for RNA splice sites of at least 150 bp in practically acceptable running times. For PM models and PVLM models of order one, the task of learning the maximum likelihood model results in the traditional Hamiltonian path problem (HPP) or the related traditional traveling salesman problem (TSP). For the more powerful PM models and PVLM models of order two, the task of learning the maximum likelihood model results in the quadratic Hamiltonian path problem (QHPP) or the related quadratic traveling salesman problem (QTSP), which are extensions of the traditional linear HPP and TSP, respectively.
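To make the difference between the linear and the quadratic problem concrete, the following minimal Python sketch evaluates a QTSP tour, in which a cost is attached to every three consecutive nodes rather than to every two. It then finds an optimal tour by plain enumeration. The function names, the cost dictionary, and the toy instance are illustrative assumptions and are not taken from the paper. Brute-force enumeration is used only for clarity on tiny instances; it is not one of the three exact algorithms studied here, and the mapping from a PM model likelihood to the triple costs is not shown.

```python
from itertools import permutations


def qtsp_tour_cost(tour, cost):
    """Sum cost[(a, b, c)] over all triples of consecutive nodes of a cyclic tour."""
    n = len(tour)
    return sum(
        cost[(tour[i], tour[(i + 1) % n], tour[(i + 2) % n])]
        for i in range(n)
    )


def brute_force_qtsp(n, cost):
    """Enumerate all cyclic tours starting at node 0 and return a cheapest one.

    Runs in O((n-1)! * n) time, so it is only usable for very small n.
    """
    best_tour, best_cost = None, float("inf")
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        c = qtsp_tour_cost(tour, cost)
        if c < best_cost:
            best_tour, best_cost = tour, c
    return best_tour, best_cost


def toy_costs(n):
    """Hypothetical instance: triples (i, i+1, i+2) modulo n cost 0, all others cost 1."""
    return {
        (i, j, k): 0 if (j == (i + 1) % n and k == (i + 2) % n) else 1
        for i in range(n)
        for j in range(n)
        for k in range(n)
        if len({i, j, k}) == 3
    }


if __name__ == "__main__":
    n = 5
    tour, value = brute_force_qtsp(n, toy_costs(n))
    print(tour, value)
```

Because the cost of a step depends on the previous two nodes, a tour and its reversal can have different costs even when the underlying data look symmetric, which is one reason the quadratic problem does not reduce directly to a TSP on the same node set.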

Permuted Markov Models and Permuted Variable Length Markov Models

Quadratic Traveling Salesman Problem

Exact Algorithms

Dynamic Programming Algorithm

Branch-and-Bound Algorithm

Branch-and-Cut Algorithm

Experimental Study

Conclusions

Journal: Computation
Publication Date: Jun 3, 2015
Citations: 21
License type: CC BY 4.0
