Abstract

In a simple pattern matching problem one has a pattern w and a text t, which are words over a finite alphabet Σ. One may ask whether w occurs in t, and if so, where. More generally, we may have a set P of patterns and a set T of texts, where P and T are regular languages. We are interested in whether any word of T begins with a word of P, ends with a word of P, has a word of P as a factor, or has a word of P as a subsequence. Thus we are interested in the languages (PΣ⁎)∩T, (Σ⁎P)∩T, (Σ⁎PΣ⁎)∩T, and (Σ⁎ ш P)∩T, where ш is the shuffle operation. The state complexity κ(L) of a regular language L is the number of states in the minimal deterministic finite automaton recognizing L. We derive the following upper bounds on the state complexities of our pattern-matching languages, where κ(P)⩽m and κ(T)⩽n: κ((PΣ⁎)∩T)⩽mn; κ((Σ⁎P)∩T)⩽2ᵐ⁻¹n; κ((Σ⁎PΣ⁎)∩T)⩽(2ᵐ⁻²+1)n; and ▪. We prove that these bounds are tight, and that to meet them, the alphabet must have at least two letters in the first three cases, and at least m−1 letters in the last case.
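
As an illustration of the first bound, κ((PΣ⁎)∩T)⩽mn, the following is a minimal sketch of the textbook direct-product construction, not the proof given in the paper. It assumes complete DFAs stored as dictionaries keyed by (state, letter); the helper names (`prefix_language_dfa`, `reachable_product`, `accepts`) and the example languages P = {ab} and T = "words with an even number of b's" are hypothetical choices made only for this example.

```python
def prefix_language_dfa(delta, finals):
    """DFA for P·Σ*: make every final state of the DFA for P absorbing.

    The state set is unchanged, so the result has at most m states.
    """
    return {(q, a): (q if q in finals else r) for (q, a), r in delta.items()}


def reachable_product(d1, s1, f1, d2, s2, f2, alphabet):
    """Reachable part of the direct-product DFA recognizing L1 ∩ L2."""
    start = (s1, s2)
    seen = {start}
    stack = [start]
    delta = {}
    while stack:
        p, q = stack.pop()
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            delta[((p, q), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    # A product state is final when both components are final.
    finals = {(p, q) for (p, q) in seen if p in f1 and q in f2}
    return seen, delta, start, finals


def accepts(delta, start, finals, word):
    """Run a complete DFA on a word."""
    state = start
    for a in word:
        state = delta[(state, a)]
    return state in finals


ALPHABET = "ab"

# Hypothetical pattern language P = {"ab"}: states 0..3, state 3 is a sink (m = 4).
P_DELTA = {
    (0, "a"): 1, (0, "b"): 3,
    (1, "a"): 3, (1, "b"): 2,
    (2, "a"): 3, (2, "b"): 3,
    (3, "a"): 3, (3, "b"): 3,
}
P_FINALS = {2}

# Hypothetical text language T = words with an even number of b's (n = 2).
T_DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 0}
T_FINALS = {0}

states, delta, start, finals = reachable_product(
    prefix_language_dfa(P_DELTA, P_FINALS), 0, P_FINALS,
    T_DELTA, 0, T_FINALS,
    ALPHABET,
)

m, n = 4, 2
print(len(states), "reachable states; bound m*n =", m * n)
print(accepts(delta, start, finals, "abb"))  # begins with "ab", two b's
print(accepts(delta, start, finals, "ab"))   # begins with "ab", one b
```

The product never has more than mn reachable states, which is where the upper bound comes from; showing that the bound is tight requires specific witness languages, which this sketch does not attempt to construct.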
