Abstract

Given a text T and a set of r patterns P1,P2,…,Pr, the exact multiple pattern matching problem asks for the ending positions of all occurrences of Pi in T for 1≤i≤r. We transform all fixed-length substrings of T into a reference tree in which each internal node stores a reference string. The problem can then be solved efficiently by searching for the patterns in the tree, guided by the reference strings. We design algorithms that use bitwise operations to construct the reference tree (the preprocessing phase) and to search for patterns in it (the searching phase). Experiments on problem instances drawn from DNA sequences and English text compare the performance of our approach with that of the suffix tree and suffix array algorithms. The computational results demonstrate the advantage of our approach over these algorithms. Despite its simplicity, our approach is efficient, flexible and robust.
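
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the reference tree method described above; it only illustrates what "reporting the ending positions of all occurrences" means. The function name `ending_positions`, the 1-based position convention, and the choice to count overlapping occurrences are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def ending_positions(text: str, patterns: Sequence[str]) -> Dict[int, List[int]]:
    """Brute-force baseline (hypothetical helper, not the paper's algorithm).

    Maps each pattern index i (1-based, as in P1..Pr) to the list of
    1-based positions in `text` at which an occurrence of that pattern ends.
    """
    result: Dict[int, List[int]] = {}
    for i, pattern in enumerate(patterns, start=1):
        ends: List[int] = []
        if pattern:  # an empty pattern has no meaningful ending position
            start = text.find(pattern)
            while start != -1:
                # 0-based start + length equals the 1-based ending position.
                ends.append(start + len(pattern))
                # Advance by one character so overlapping occurrences count.
                start = text.find(pattern, start + 1)
        result[i] = ends
    return result


if __name__ == "__main__":
    print(ending_positions("ACGTACGT", ["ACG", "GT", "TT"]))
```

A baseline of this kind rescans the text once per pattern, which is the cost that index-based approaches such as suffix trees, suffix arrays, or the reference tree aim to avoid by preprocessing T once.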

