An upper bound on the hardness of exact matrix based motif discovery

Paul Horton, Wataru Fujibuchi

doi:10.1016/j.jda.2006.10.006

Abstract

Motif discovery is the problem of finding local patterns or motifs from a set of unlabeled sequences. One common representation of a motif is a Markov model known as a score matrix. Matrix-based motif discovery has been extensively studied, but no positive results have been known regarding its theoretical hardness. We present the first non-trivial upper bound on the complexity (worst-case computation time) of this problem. Other than linear terms, our bound depends only on the motif width $w$ (which is typically 5–20) and is a dramatic improvement relative to previously known bounds. We prove this bound by relating the motif discovery problem to a search problem over permutations of strings of length $w$, in which the permutations have a particular property. We give a constructive proof of an upper bound on the number of such permutations. For an alphabet size of $\sigma$ (typically 4) the trivial bound is $n! \approx (n/e)^n$, where $n = \sigma^w$. Our bound is roughly $n(\sigma \log_\sigma n)^n$. We relate this theoretical result to the exact motif discovery program, TsukubaBB, whose algorithm contains ideas which inspired the result. We describe a recent improvement to the TsukubaBB program which can give a speedup of a factor of nine or more, and use a dataset of REB1 transcription factor binding sites to illustrate that exact methods can indeed be used in some practical situations.
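To get a feel for the gap between the two expressions quoted in the abstract, the following sketch compares them on a log scale. It assumes the rough form of the bound exactly as written in the abstract, n(σ log_σ n)^n with n = σ^w (so log_σ n = w), and is not the exact bound proved in the paper. The function names are hypothetical and chosen only for this illustration.

```python
import math


def log10_trivial_bound(sigma: int, w: int) -> float:
    """log10 of the trivial bound n!, where n = sigma**w.

    Uses lgamma(n + 1) = ln(n!) so that huge factorials never overflow.
    """
    n = sigma ** w
    return math.lgamma(n + 1) / math.log(10)


def log10_rough_bound(sigma: int, w: int) -> float:
    """log10 of n * (sigma * log_sigma(n))**n, where n = sigma**w.

    Assumption: this is the *rough* form quoted in the abstract.
    Since n = sigma**w, log_sigma(n) is simply w.
    """
    n = sigma ** w
    return (math.log(n) + n * math.log(sigma * w)) / math.log(10)


if __name__ == "__main__":
    sigma = 4  # DNA alphabet size, the typical case named in the abstract
    for w in (1, 2, 3, 5, 8, 10):
        trivial = log10_trivial_bound(sigma, w)
        rough = log10_rough_bound(sigma, w)
        print(f"w={w:2d}  log10(n!)={trivial:14.1f}  log10(rough bound)={rough:14.1f}")
```

Under this reading of the formula with σ = 4, the rough expression is larger than n! for w = 1 and w = 2 and smaller from w = 3 onward, which is consistent with the abstract's focus on motif widths of 5–20.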

Journal: Journal of Discrete Algorithms	Publication Date: Dec 18, 2006
Citations: 2	License type: publisher-specific-oa

Similar Papers

An Upper Bound on the Hardness of Exact Matrix Based Motif Discovery
Paul Horton ... Wataru Fujibuchi
01 Jan 2004

MCOIN: a novel heuristic for determining TFBS motif width
...
18 Jun 2013

Automated incorporation of pairwise dependency in transcription factor binding site prediction using dinucleotide weight tensors.
Saeed Omidi ... Mihaela Zavolan
PLOS Computational Biology | VOL. 13
28 Jul 2017

A profile-based deterministic sequential Monte Carlo algorithm for motif discovery
Kuo-Ching Liang ... Dimitris Anastassiou
Bioinformatics | VOL. 24
17 Nov 2007
