A faster algorithm for approximate string matching

Ricardo Baeza-Yates, Gonzalo Navarro

doi:10.1007/3-540-61258-0_1

Abstract

We present a new algorithm for on-line approximate string matching. The algorithm is based on the simulation of a non-deterministic finite automaton built from the pattern, using the text as input. The simulation uses bit operations on a RAM machine with word length O(log n), where n is the maximum size of the text. The running time achieved is O(n) for small patterns (i.e., of length m = O(√(log n))), independently of the maximum number of errors allowed, k. This algorithm is then used to design two general algorithms. One of them partitions the problem into subproblems, while the other partitions the automaton into sub-automata. These algorithms are combined to obtain a hybrid algorithm which on average is O(n) for moderate k/m ratios, O(√(mk/log n) · n) for medium ratios, and O((m−k)kn/log n) for large ratios. We show experimentally that this hybrid algorithm is faster than previous ones for moderate pattern sizes and error ratios, which is the typical case in text searching.

Keywords: Dynamic Programming, Hybrid Algorithm, Error Ratio, Problem Partitioning, Small Pattern

These keywords were added by machine and not by the authors.
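To make the idea of simulating the pattern automaton with bit operations concrete, here is a minimal sketch in Python. It is not the algorithm of this paper: it uses the simpler, well-known row-wise scheme in the style of Wu and Manber (one bit vector per error level), whereas the abstract does not say how the automaton is packed into words. The function name approx_find and its return convention (0-based end positions of matches) are assumptions made for illustration. Python integers are unbounded, so the sketch also ignores the O(log n) word-length restriction that the complexity results depend on.

```python
def approx_find(pattern, text, k):
    """Return 0-based end positions in text where pattern matches with <= k errors.

    Row-wise bit-parallel NFA simulation (Wu-Manber style), shown only as an
    illustration; it is NOT the bit layout used in the paper.
    """
    m = len(pattern)
    # masks[c] has bit i set when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)
    full = (1 << m) - 1      # keep only the m state bits
    accept = 1 << (m - 1)    # bit of the final automaton state

    # rows[j], bit i: pattern[:i+1] matches a suffix of the text read so far
    # with at most j errors. Initially j pattern characters can be deleted.
    rows = [((1 << j) - 1) & full for j in range(k + 1)]
    ends = []
    for pos, c in enumerate(text):
        mask = masks.get(c, 0)
        prev_old = rows[0]
        rows[0] = ((rows[0] << 1) | 1) & mask
        prev_new = rows[0]
        for j in range(1, k + 1):
            old = rows[j]
            rows[j] = (
                (((old << 1) | 1) & mask)  # matching character
                | prev_old                 # extra character in the text
                | (prev_old << 1)          # substituted character
                | (prev_new << 1)          # missing pattern character
                | 1                        # first state reachable with one error
            ) & full
            prev_old, prev_new = old, rows[j]
        if rows[k] & accept:
            ends.append(pos)
    return ends


print(approx_find("abc", "xxabcxx", 0))  # exact matching: [4]
print(approx_find("abc", "abd", 1))      # one error allowed: [1, 2]
```

In the second call, position 1 is reported because "ab" is one deletion away from "abc", and position 2 because "abd" is one substitution away. Each text character costs k + 1 word operations here, so this row-wise sketch does not by itself give the running time stated in the abstract, which is independent of k for small patterns.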

Similar Papers

Random Access to Grammar-Compressed Strings
Philip Bille ... Oren Weimann
23 Jan 2011

Extended approximate string matching algorithms to detect name aliases
Muniba Shaikh ... Uffe Kock Wiil
01 Jul 2011

Faster Approximate String Matching
R Baeza-Yates and G Navarro
Algorithmica | VOL. 23
01 Feb 1999
