Abstract

Automata, Logic and Semantics

Our aim is to construct a finite automaton recognizing the set of words that are at a bounded distance from some word of a given regular language. We define new regular operators, the similarity operators, based on a generalization of the notion of distance, and we introduce the family of regular expressions extended with similarity operators, which we call AREs (Approximate Regular Expressions). We give formulae to compute the Brzozowski derivatives and the Antimirov derivatives of an ARE, which allows us to solve the ARE membership problem and to construct two recognizers for the language denoted by an ARE. As far as we know, the family of approximate regular expressions is introduced for the first time in this paper. Classical approximate regular expression matching algorithms perform approximate matching on regular expressions; our approach is instead to perform exact matching on approximate regular expressions.

Highlights

  • This paper addresses the problem of constructing a finite automaton that recognizes the language of all the words that are at a distance less than or equal to a given positive integer k from some word of a given regular language

  • We first define a new family of operators: given an integer k, the F_k operator is such that, for any regular language L, the language F_k(L) is the set of all the words that are at a distance less than or equal to k from some word of L

  • We extend the computation of Brzozowski derivatives [3] to the family of approximate regular expressions
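The definition of the operator in the second highlight can be illustrated with a minimal sketch. It rests on two assumptions that are not in the paper: the language L is given as a finite collection of words (the paper handles arbitrary regular languages), and the distance is the Levenshtein distance. The names `levenshtein` and `in_F_k` are hypothetical.

```python
from typing import Iterable


def levenshtein(u: str, v: str) -> int:
    """Edit distance (insertions, deletions, substitutions) by dynamic programming."""
    prev = list(range(len(v) + 1))  # distances from the empty prefix of u
    for i, a in enumerate(u, start=1):
        curr = [i]  # distance from u[:i] to the empty prefix of v
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # substitute a by b, or match
            ))
        prev = curr
    return prev[-1]


def in_F_k(word: str, language: Iterable[str], k: int) -> bool:
    """True if `word` is at distance at most k from some word of `language`.

    Assumption: `language` is finite, so it can be scanned word by word.
    """
    return any(levenshtein(word, w) <= k for w in language)
```

With L = {"abc"}, the call `in_F_k("ab", ["abc"], 1)` holds because one insertion suffices, while `in_F_k("xyz", ["abc"], 2)` does not because three substitutions are needed. This brute-force check does not scale to infinite regular languages, which is the case the automaton construction is meant to handle.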


Summary

Introduction

This paper addresses the problem of constructing a finite automaton that recognizes the language of all the words that are at a distance less than or equal to a given positive integer k from some word of a given regular language. The aim of this paper is to investigate the properties of the family of AREs, in particular to define formulae for computing the set of (Brzozowski or Antimirov) derivatives of an ARE and to check the properties of this set. This theoretical study leads to a solution to the approximate membership problem as well as to the approximate regular expression matching problem (based on the automaton associated with the set of derivatives of an ARE). The standard case of the Hamming and Levenshtein distances is first described and illustrated in Section 4 (without proofs), while the general case is addressed in Section 5; the link between the proofs of the standard case and those of the general case is shown in Subsection 5.4.
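To make the derivative-based membership test concrete, here is a minimal sketch of Brzozowski derivatives for regular expressions extended with a Hamming similarity operator. The rule used for that operator is derived here directly from the definition of the Hamming distance and is an assumption; the paper's own formulae (Section 4) are not reproduced on this page and may be stated differently. The tuple encoding and the names `ham`, `deriv` and `matches` are hypothetical.

```python
EMPTY = ("empty",)  # the empty language
EPS = ("eps",)      # the language containing only the empty word


def sym(a):
    return ("sym", a)


def alt(e, f):
    """Union, simplified so that derivatives stay small."""
    if e == EMPTY:
        return f
    if f == EMPTY or e == f:
        return e
    return ("alt", e, f)


def cat(e, f):
    """Concatenation, simplified on EMPTY and EPS."""
    if e == EMPTY or f == EMPTY:
        return EMPTY
    if e == EPS:
        return f
    if f == EPS:
        return e
    return ("cat", e, f)


def star(e):
    return EPS if e in (EMPTY, EPS) else ("star", e)


def ham(k, e):
    """Words at Hamming distance at most k from some word of L(e)."""
    if e == EMPTY or k == 0:
        return e  # distance 0 is exact matching
    return ("ham", k, e)


def nullable(e):
    """True if the empty word belongs to L(e)."""
    tag = e[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "alt":
        return nullable(e[1]) or nullable(e[2])
    if tag == "cat":
        return nullable(e[1]) and nullable(e[2])
    return nullable(e[2])  # ham: only the empty word has length 0


def deriv(e, a, alphabet):
    """Brzozowski derivative of e with respect to the symbol a."""
    tag = e[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if e[1] == a else EMPTY
    if tag == "alt":
        return alt(deriv(e[1], a, alphabet), deriv(e[2], a, alphabet))
    if tag == "cat":
        left = cat(deriv(e[1], a, alphabet), e[2])
        return alt(left, deriv(e[2], a, alphabet)) if nullable(e[1]) else left
    if tag == "star":
        return cat(deriv(e[1], a, alphabet), e)
    # ham (assumed rule): either a is matched exactly and the budget k is kept,
    # or a replaces some other symbol b and one unit of the budget is spent.
    k, body = e[1], e[2]
    result = ham(k, deriv(body, a, alphabet))
    for b in alphabet:
        if b != a:
            result = alt(result, ham(k - 1, deriv(body, b, alphabet)))
    return result


def matches(e, word, alphabet):
    """Membership test: derive by each symbol, then check nullability."""
    for a in word:
        e = deriv(e, a, alphabet)
    return nullable(e)
```

For example, with `E = ham(1, star(cat(sym("a"), sym("b"))))` over the alphabet `"ab"`, the word `"aaab"` is accepted because it differs from `"abab"` in one position, whereas `"ba"` is rejected because it differs from `"ab"` in two. The sketch only decides membership; it does not address the finiteness of the set of derivatives, which is what the automaton construction in the paper relies on.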

Preliminaries
Comparison Functions
Hamming and Levenshtein Derivation Formulae
Brzozowski Derivatives for an HLARE
Antimirov Partial Derivatives of an HLARE
Quotient of a Language
Brzozowski Derivatives for an ARE
Antimirov Derivatives for an ARE
Back to Hamming and Levenshtein Derivation
Conclusion