Space-efficient representation of truncated suffix trees, with applications to Markov order estimation

Luciana Vitale, Álvaro Martín, Gadiel Seroussi

doi:10.1016/j.tcs.2015.06.013

Abstract

Suffix trees (STs) are useful in many text processing applications, for example, to determine the number of occurrences of patterns of arbitrary length in an input string x. If the length n of x is large, the memory required to represent the ST may become a practical performance bottleneck. When a nontrivial upper bound is known on the lengths of the patterns of interest, this problem can be alleviated by using a truncated ST (TST). However, conventional TST implementations still require Ω(n) bits of memory, since they store x. We describe a new TST representation that avoids this limitation by storing all the information necessary to reconstruct the TST edge labels in a string y that is often much shorter than x. We apply TSTs to the implementation of Markov order estimators, where an upper bound k_n on the estimated order can be derived or is imposed (for consistency, for example). The new representation allows for estimator implementations with sublinear space complexity in some cases of interest. In other cases we show experimentally that, even when the new representation has no asymptotic advantage, it still achieves very significant memory savings in practice.
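To make the counting use case concrete, here is a minimal Python sketch of a depth-limited suffix trie that answers occurrence queries for patterns of length at most k. It is an illustration only, not the representation described in the paper: it uses an uncompacted trie rather than a compacted TST, it does not build the shorter string y, and the names `TruncatedSuffixTrie`, `max_len`, and `count` are hypothetical.

```python
class _Node:
    """Trie node; `count` is the number of occurrences in x of the
    string spelled by the path from the root to this node."""

    def __init__(self):
        self.children = {}
        self.count = 0


class TruncatedSuffixTrie:
    """Naive uncompacted suffix trie truncated at depth max_len.

    Illustrative assumption: one node per distinct substring of
    length <= max_len, which is simple but not space-efficient.
    """

    def __init__(self, x, max_len):
        if max_len < 1:
            raise ValueError("max_len must be at least 1")
        self.max_len = max_len
        self.root = _Node()
        # Insert the first max_len symbols of every suffix of x,
        # incrementing the counter of each node visited.
        for i in range(len(x)):
            node = self.root
            for symbol in x[i:i + max_len]:
                child = node.children.get(symbol)
                if child is None:
                    child = _Node()
                    node.children[symbol] = child
                child.count += 1
                node = child

    def count(self, pattern):
        """Return the number of occurrences of a non-empty pattern in x."""
        if not 1 <= len(pattern) <= self.max_len:
            raise ValueError("pattern length must be between 1 and max_len")
        node = self.root
        for symbol in pattern:
            node = node.children.get(symbol)
            if node is None:
                return 0
        return node.count


trie = TruncatedSuffixTrie("abracadabra", max_len=3)
print(trie.count("a"), trie.count("ab"), trie.count("bra"))
```

Queries longer than `max_len` are rejected rather than answered with 0, since the truncated structure holds no information about them.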

Journal: Theoretical Computer Science
Publication Date: Jun 11, 2015
Citations: 6
License type: publisher-specific-oa
