Canonical derivatives, partial derivatives and finite automaton constructions

J.-M. Champarnaud, D. Ziadi

doi:10.1016/s0304-3975(01)00267-5

Abstract

Let E be a regular expression. Our aim is to establish a theoretical relation between two well-known automata recognizing the language of E, namely the position automaton P_E constructed by Glushkov or McNaughton and Yamada, and the equation automaton E_E constructed by Mirkin or Antimirov. We define the notion of c-derivative (for canonical derivative) of a regular expression E and show that if E is linear then two Brzozowski derivatives of E are aci-similar if and only if the corresponding c-derivatives are identical. This allows us to represent the Berry–Sethi set of continuations of a position by a unique c-derivative, called the c-continuation of the position. This leads to the definition of C_E, the c-continuation automaton of E, whose states are pairs made of a position of E and the associated c-continuation. If states are viewed as positions, C_E is isomorphic to P_E. On the other hand, a partial derivative, as defined by Antimirov, is a class of c-derivatives for some equivalence relation, so C_E reduces to E_E. Thus C_E makes it possible to go from P_E to E_E, which cannot be achieved directly from the state graphs. These theoretical results lead to an O(|E|^2) space and time algorithm to compute the equation automaton, where |E| is the size of the expression. This matches the complexity of the most efficient constructions yielding the position automaton, while the equation automaton is never larger, and generally much smaller, than the position automaton.
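To make the equation automaton concrete, the following is a minimal sketch of the textbook construction from Antimirov's partial derivatives. It is not the O(|E|^2) algorithm announced in the abstract, which goes through c-continuations; it is the direct, unoptimized definition. The tuple encoding of expressions and all function names (`sym`, `alt`, `cat`, `star`, `partial_derivatives`, `equation_automaton`, `accepts`) are assumptions made for illustration only.

```python
from collections import deque

# Hypothetical encoding: regular expressions as nested tuples.
EPS = ("eps",)      # empty word
EMPTY = ("empty",)  # empty language

def sym(a): return ("sym", a)
def alt(l, r): return ("alt", l, r)
def cat(l, r): return ("cat", l, r)
def star(e): return ("star", e)

def nullable(e):
    """True if the language of e contains the empty word."""
    tag = e[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "alt":
        return nullable(e[1]) or nullable(e[2])
    return nullable(e[1]) and nullable(e[2])  # cat

def concat(d, f):
    """Concatenate d and f, simplifying eps.f to f."""
    return f if d == EPS else cat(d, f)

def partial_derivatives(e, a):
    """Set of partial derivatives of e with respect to the symbol a."""
    tag = e[0]
    if tag in ("eps", "empty"):
        return frozenset()
    if tag == "sym":
        return frozenset([EPS]) if e[1] == a else frozenset()
    if tag == "alt":
        return partial_derivatives(e[1], a) | partial_derivatives(e[2], a)
    if tag == "cat":
        left = frozenset(concat(d, e[2]) for d in partial_derivatives(e[1], a))
        if nullable(e[1]):
            return left | partial_derivatives(e[2], a)
        return left
    # star: derive the body, then continue with the star itself
    return frozenset(concat(d, e) for d in partial_derivatives(e[1], a))

def equation_automaton(e, alphabet):
    """States are the partial derivatives reachable from e (e is initial)."""
    states, delta, queue = {e}, {}, deque([e])
    while queue:
        q = queue.popleft()
        for a in alphabet:
            targets = partial_derivatives(q, a)
            delta[(q, a)] = targets
            for t in targets:
                if t not in states:
                    states.add(t)
                    queue.append(t)
    finals = {q for q in states if nullable(q)}
    return states, delta, finals

def accepts(e, alphabet, word):
    """Simulate the (nondeterministic) equation automaton on word."""
    _, delta, finals = equation_automaton(e, alphabet)
    current = {e}
    for a in word:
        current = {t for q in current for t in delta.get((q, a), ())}
    return bool(current & finals)

# Example: E = (a + b)* a, which has three positions.
E = cat(star(alt(sym("a"), sym("b"))), sym("a"))
states, delta, finals = equation_automaton(E, "ab")
print(len(states))  # 2
```

For E = (a + b)* a the sketch yields two states (E itself and the empty word), whereas the position automaton has one state per position plus an initial state, that is four states. This illustrates the size comparison made at the end of the abstract.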

Full Text