Abstract

A pattern matching algorithm only a few lines long is obtained by using program correctness proofs as a tool for designing efficient algorithms. The new algorithm is derived from a brute-force algorithm in three refinement steps. The first step leads to the algorithm of Knuth, Morris, and Pratt, which performs 2n character comparisons in the worst case and (1 + α)n comparisons in the average case (0 < α ≤ 0.5). Two more steps give a faster algorithm that performs 1.5n character comparisons in the worst case and is sublinear on a random text for all patterns. These bounds are lower than the corresponding bounds for the Boyer and Moore algorithm, which performs more than 2n character comparisons in the worst case and, for some patterns, requires more than n character comparisons on a random text. However, when averaged over all patterns of a given length, the Boyer and Moore algorithm is also sublinear on a random text, and its performance improves as the pattern gets longer.
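To make the starting point and the first refinement concrete, here is a minimal Python sketch of a brute-force matcher and of the textbook failure-table formulation of the Knuth, Morris, and Pratt algorithm, each instrumented to count character comparisons against the text. This is an illustration under stated assumptions, not the derivation or the code from the paper: the function names (`brute_force_search`, `failure_table`, `kmp_search`) and the counting convention are choices made here, and the faster 1.5n algorithm is not reproduced.

```python
def brute_force_search(pattern, text):
    """Return (match positions, number of character comparisons)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m, n = len(pattern), len(text)
    positions, comparisons = [], 0
    for i in range(n - m + 1):
        j = 0
        # Compare the pattern against the text window starting at i.
        while j < m:
            comparisons += 1
            if text[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            positions.append(i)
    return positions, comparisons


def failure_table(pattern):
    """fail[j] is the length of the longest proper border of pattern[:j + 1]."""
    fail = [0] * len(pattern)
    k = 0
    for j in range(1, len(pattern)):
        while k > 0 and pattern[j] != pattern[k]:
            k = fail[k - 1]
        if pattern[j] == pattern[k]:
            k += 1
        fail[j] = k
    return fail


def kmp_search(pattern, text):
    """Return (match positions, number of text-character comparisons).

    Comparisons made while building the failure table are not counted
    (an assumption of this sketch).
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    fail = failure_table(pattern)
    positions, comparisons = [], 0
    j = 0  # number of pattern characters currently matched
    for i, c in enumerate(text):
        while True:
            comparisons += 1
            if c == pattern[j]:
                j += 1
                break
            if j == 0:
                break
            # Mismatch: shift the pattern without moving back in the text.
            j = fail[j - 1]
        if j == m:
            positions.append(i - m + 1)
            j = fail[j - 1]
    return positions, comparisons
```

In this sketch every comparison either advances the text position or shifts the pattern, and the pattern cannot be shifted more often than it has advanced, which is why `kmp_search` never exceeds 2n comparisons on a text of length n. The brute-force version has no such guarantee: its worst case grows with the product of the pattern and text lengths.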
