Automata generation based on recurrent neural networks and automated clusterization selection

Anatoly Shalyto, Andrey Filchenkov, Petr Grachev, Sergey Muravyov

doi:10.31799/1684-8853-2020-1-34-43

Abstract

Introduction: The regular inference problem is to synthesize a deterministic finite-state automaton from a list of words that are examples and counterexamples of some unknown regular language. This is one of the central problems in the theory of formal languages and related fields. One of the most successful approaches is to train a recurrent neural network (RNN) on word classification and then cluster the vectors in the space of RNN inner weights. However, there is no guarantee that a consistent automaton can be constructed from the clustering results. More complex models require more memory, training time and training samples. Purpose: To create a new grammar inference algorithm that uses modern machine learning methods. Methods: A recurrent neural network with an error function proposed by the authors was used for classification. For clustering, the method of joint selection and tuning of hyperparameters was used. Results: Ten datasets, corresponding to ten different regular grammars and ten automata, were used to test the models. According to the test results, the developed model successfully synthesizes automata with no more than five input characters and states. For four of the seven successfully inferred grammars, the constructed automaton was minimal. For three datasets, an automaton could not be built, either because the proposed partition had too few clusters or because no consistent automaton could be built for this partition. Discussion: Applying an algorithm that searches for the maximum likelihood between the clusters of vectors and the corresponding states, in order to resolve structural conflicts, may expand the scope of the model.
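
The consistency issue mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `build_automaton`, the callback `cluster_of` (a hypothetical stand-in for "cluster index of the RNN vector reached after reading a prefix") and the toy language "even number of a's" are all assumptions made for illustration. The sketch treats each cluster as a candidate state and reports failure when one (state, symbol) pair leads to two different clusters, or when one cluster contains both an example and a counterexample.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Transitions = Dict[Tuple[int, str], int]


def build_automaton(
    samples: Iterable[Tuple[str, bool]],
    cluster_of: Callable[[str], int],  # hypothetical: prefix -> cluster index
) -> Optional[Tuple[Transitions, Dict[int, bool]]]:
    """Return (transitions, accepting), or None if the partition is inconsistent."""
    transitions: Transitions = {}
    accepting: Dict[int, bool] = {}
    for word, label in samples:
        state = cluster_of("")  # cluster of the empty prefix is the start state
        for i, symbol in enumerate(word):
            target = cluster_of(word[: i + 1])
            # setdefault stores the target on first sight and returns the stored value
            if transitions.setdefault((state, symbol), target) != target:
                return None  # structural conflict: two targets for one (state, symbol)
            state = target
        if accepting.setdefault(state, label) != label:
            return None  # one cluster holds both an example and a counterexample
    return transitions, accepting


# Toy data (assumed): words over {a, b}, accepted when the number of a's is even.
samples = [("", True), ("a", False), ("aa", True), ("ab", False),
           ("ba", False), ("bb", True), ("aba", True)]

# A partition that matches the language: parity of the number of a's.
good = build_automaton(samples, lambda prefix: prefix.count("a") % 2)

# A partition that does not: parity of the prefix length.
bad = build_automaton(samples, lambda prefix: len(prefix) % 2)
```

Here `good` is a two-state automaton, while `bad` is `None` because the length-parity partition places the accepted word "aa" and the rejected word "ab" in the same cluster.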
