Abstract

We present a simpler and faster solution to the problem of matching a set of patterns with variable length don't cares. Given an alphabet $\Sigma$, a pattern $p$ is a word $p_1 @ p_2 @ \cdots @ p_m$, where each $p_i$ is a string over $\Sigma$ called a keyword and $@ \notin \Sigma$ is a symbol called a variable length don't care (VLDC) symbol. Pattern $p$ matches a text $t$ if $t = u_0 p_1 u_1 \cdots u_{m-1} p_m u_m$ for some $u_0, \ldots, u_m \in \Sigma^*$. The problem addressed in this paper is: given a set of patterns $P$ and a text $t$, determine whether one of the patterns of $P$ matches $t$. Kucherov and Rusinowitch (1997) [9] presented an algorithm that solves the problem in time $O((|t| + |P|) \log |P|)$, where $|P|$ is the total length of the keywords in all patterns of $P$. We give a new algorithm based on the Aho–Corasick automaton. It uses the solution to the Dynamic Marked Ancestor Problem of Chan et al. (2007) [5]. The algorithm takes $O((|t| + \|P\|) \log \kappa / \log \log \kappa)$ time, where $\|P\|$ is the total number of keywords in all patterns of $P$, and $\kappa$ is the number of distinct keywords in $P$. The algorithm is faster and simpler than the previous approach.
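
To make the matching definition concrete, the following is a minimal Python sketch of a naive baseline. It is not the algorithm of this paper, which relies on the Aho–Corasick automaton and a dynamic marked ancestor structure. It assumes each pattern is given as a list of its keywords (the VLDC symbols are implicit between them) and scans for each keyword in turn, always taking the leftmost occurrence after the previous one. The function names `matches` and `any_pattern_matches` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Sequence


def matches(keywords: Sequence[str], text: str) -> bool:
    """Check whether text = u0 + p1 + u1 + ... + pm + um for some strings ui.

    Taking the leftmost occurrence of each keyword is safe: it leaves the
    longest possible remainder of the text for the keywords that follow.
    """
    pos = 0  # first text position not yet consumed by an earlier keyword
    for keyword in keywords:
        idx = text.find(keyword, pos)
        if idx == -1:
            return False
        pos = idx + len(keyword)  # keywords must not overlap
    return True


def any_pattern_matches(patterns: Iterable[Sequence[str]], text: str) -> bool:
    """Decide whether at least one pattern in the set matches the text."""
    return any(matches(keywords, text) for keywords in patterns)
```

For instance, under this representation the pattern `ab@cd` becomes `["ab", "cd"]`; it matches `xxabyycdzz` but not `cdab`. This baseline rescans the text once per pattern, which is the cost the automaton-based approach is designed to avoid.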
