Fast searching in packed strings

Philip Bille

doi:10.1016/j.jda.2010.09.003

Open Access

https://doi.org/10.1016/j.jda.2010.09.003

Journal: Journal of Discrete Algorithms
Publication Date: Sep 16, 2010
Citations: 13
License type: unspecified-oa

Affiliation: Technical University of Denmark

Abstract

Given strings P and Q, the (exact) string matching problem is to find all positions of substrings in Q matching P. The classical Knuth–Morris–Pratt algorithm [SIAM J. Comput. 6 (2) (1977) 323–350] solves the string matching problem in linear time, which is optimal if we can only read one character at a time. However, most strings are stored in a computer in a packed representation with several characters in a single word, giving us the opportunity to read multiple characters simultaneously. In this paper we study the worst-case complexity of string matching on strings given in packed representation. Let m ⩽ n be the lengths of P and Q, respectively, and let σ denote the size of the alphabet. On a standard unit-cost word-RAM with logarithmic word size we present an algorithm using time O(n / log_σ n + m + occ). Here occ is the number of occurrences of P in Q. For m = o(n) this improves the O(n) bound of the Knuth–Morris–Pratt algorithm. Furthermore, if m = O(n / log_σ n) our algorithm is optimal, since any algorithm must spend at least Ω((n + m) log σ / log n + occ) = Ω(n / log_σ n + occ) time to read the input and report all occurrences. The result is obtained by a novel automaton construction based on the Knuth–Morris–Pratt algorithm combined with a new compact representation of subautomata allowing an optimal tabulation-based simulation.
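To make the baseline and the packed representation concrete, here is a minimal Python sketch. It is not the paper's algorithm: `kmp_search` is the textbook one-character-at-a-time Knuth–Morris–Pratt matcher that the abstract compares against, and `pack` only illustrates how several characters can share one machine word. The function names, the 64-bit word size, and the choice of ⌈log2 σ⌉ bits per character are assumptions made for illustration; the paper works with a word size logarithmic in n.

```python
def failure_function(pattern: str) -> list[int]:
    """fail[i] = length of the longest proper border of pattern[:i + 1]."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(pattern: str, text: str) -> list[int]:
    """Return the start positions of all occurrences of pattern in text.

    Reads the text one character at a time, so it takes O(n + m) time.
    """
    if not pattern:
        return []
    fail = failure_function(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # continue, allowing overlapping occurrences
    return matches


def pack(text: str, alphabet: str, word_bits: int = 64) -> list[int]:
    """Pack text into integers, each holding word_bits // bits characters.

    Assumption for illustration: ceil(log2(sigma)) bits per character and a
    fixed 64-bit word.
    """
    bits = max(1, (len(alphabet) - 1).bit_length())
    per_word = word_bits // bits
    code = {c: i for i, c in enumerate(alphabet)}
    words = []
    for start in range(0, len(text), per_word):
        w = 0
        for offset, c in enumerate(text[start:start + per_word]):
            w |= code[c] << (offset * bits)
        words.append(w)
    return words
```

With a four-letter alphabet such as "ACGT", each character needs 2 bits, so a 64-bit word holds 32 characters and an 80-character text packs into 3 words. For a sense of scale in the stated bound: with σ = 4 and n = 2^20, log_σ n = 10, so the n / log_σ n term is about one tenth of n.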

Similar Papers

Fast Searching in Packed Strings
Philip Bille
01 Jan 2009

Comparison of Knuth Morris Pratt and Boyer Moore algorithms for a web-based dictionary of computer terms
Ali Khumaidi ... Yusuf Aras Ronisah
Jurnal Informatika | VOL. 14
01 Jan 2020

High performance parallel KMP algorithm on a heterogeneous architecture
Neungsoo Park ... Myungho Lee
Cluster Computing | VOL. 23
22 Aug 2019

A new string matching algorithm based on logical indexing
Daniar Heri Kurniawan ... Rinaldi Munir
01 Aug 2015
