Abstract

We introduce efficient indexes for a problem in non-standard stringology: jumbled pattern matching. An index is a data structure constructed for a text of length $n$ over an alphabet of size $\sigma$ that can answer queries asking if the text contains a fragment which is jumbled (Abelian) equivalent to a pattern, specified by its so-called Parikh vector. We denote the length of the pattern by $m$. Moosa and Rahman (J Discrete Algorithms 10:5–9, 2012) gave an index for the case of binary alphabets with $\mathcal{O}\left(\frac{n^2}{(\log n)^2}\right)$-time construction in the word-RAM model. Several earlier papers stated as an open problem the existence of an efficient solution for larger alphabets. In this paper we develop an index for any constant-sized alphabet. The construction involves a trade-off parameter, which in particular lets us achieve the following complexities: $\mathcal{O}(n^{2-\delta})$ space and $\mathcal{O}(m^{(2\sigma-1)\delta})$ query time for any $0<\delta<1$, or $\mathcal{O}\left(\frac{n^2 (\log\log n)^2}{\log n}\right)$ space and polylogarithmic, $o(\log^{2\sigma-1} m)$, query time. The construction time in both cases is subquadratic: $\mathcal{O}\left(\frac{n^2 (\log\log n)^2}{\log n}\right)$ in the word-RAM model (using bit-parallelism). Our construction algorithms are randomized (Las Vegas, running time w.h.p.), which is due to the usage of perfect hashing. On the other hand, all queries are answered deterministically. A preliminary version of this work appeared at ESA 2013 (Kociumaka et al. in Algorithms, ESA 2013. LNCS, vol 8125. Springer, Berlin, pp. 625–636, 2013). Here we improve it in several ways. We achieve $\mathcal{O}(n^2)$-time construction of the index with $\mathcal{O}(n^{2-\delta})$ space and $\mathcal{O}(m^{(2\sigma-1)\delta})$ query time, which was not present in the preliminary version. We also extend the index so that the position of the leftmost occurrence of the query pattern is provided at no additional cost in the complexity; this required rather nontrivial changes in the construction algorithm.
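To make the query concrete, here is a minimal sketch of jumbled pattern matching without any index: a sliding window over the text whose character counts are compared with the pattern's Parikh vector. This is only a naive linear-scan baseline for illustration, not the index construction described in the abstract. The function names `parikh_vector` and `leftmost_jumbled_occurrence` are hypothetical, and the Parikh vector is assumed to be given as a dictionary from characters to counts.

```python
from collections import Counter


def parikh_vector(s):
    """Return the Parikh vector of s: how often each character occurs."""
    return dict(Counter(s))


def leftmost_jumbled_occurrence(text, pattern_counts):
    """Return the start of the leftmost fragment of text whose Parikh vector
    equals pattern_counts, or -1 if there is none.

    Naive baseline (assumption: not the paper's index): one pass over the
    text, maintaining the character counts of the current window.
    """
    # Ignore zero entries so that dictionary equality is meaningful.
    target = {c: k for c, k in pattern_counts.items() if k > 0}
    m = sum(target.values())  # pattern length
    if m == 0:
        return 0
    if m > len(text):
        return -1

    window = {}
    for i, ch in enumerate(text):
        # Extend the window by the character at position i.
        window[ch] = window.get(ch, 0) + 1
        # Drop the character that falls out of the length-m window.
        if i >= m:
            old = text[i - m]
            window[old] -= 1
            if window[old] == 0:
                del window[old]
        # The window text[i-m+1 .. i] is complete once i >= m - 1.
        if i >= m - 1 and window == target:
            return i - m + 1
    return -1
```

For example, in the text `abcabb` the pattern with Parikh vector `{"a": 1, "b": 2}` first occurs at position 3 (the fragment `abb`). Each query in this sketch scans the whole text, which is exactly the per-query cost an index is meant to avoid.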
