Simple induction of (deterministic) probabilistic finite-state automata for phonotactics by stochastic gradient descent

Huteng Dai, Richard Futrell

doi:10.18653/v1/2021.sigmorphon-1.19

Abstract

We introduce a simple and highly general phonotactic learner which induces a probabilistic finite-state automaton from word-form data. We describe the learner and show how to parameterize it to induce unrestricted regular languages, as well as how to restrict it to certain subregular classes such as Strictly k-Local and Strictly k-Piecewise languages. We evaluate the learner on its ability to learn phonotactic constraints in toy examples and in datasets of Quechua and Navajo. We find that an unrestricted learner is the most accurate overall when modeling attested forms not seen in training; however, only the learner restricted to the Strictly Piecewise language class successfully captures certain nonlocal phonotactic constraints. Our learner serves as a baseline for more sophisticated methods.

Highlights

Natural language phonotactics is argued to fall in the class of regular languages, or even in a smaller class of subregular languages (Rogers et al., 2013).
We find that an unrestricted probabilistic finite-state automaton (PFA) learner is the most accurate at predicting real held-out forms, while a Strictly Piecewise (SP) learner is the most effective at learning certain nonlocal constraints.
We introduce a framework for phonotactic learning based on simple induction of probabilistic finite-state automata by stochastic gradient descent.

Summary

Introduction

Natural language phonotactics is argued to fall in the class of regular languages, or even in a smaller class of subregular languages (Rogers et al., 2013). This observation has motivated the study of probabilistic finite-state automata (PFAs) that generate these languages as models of phonotactics. We implement a simple method for inducing PFAs for phonotactics from data. The method can induce general regular languages as well as languages in certain more restricted subclasses, for example Strictly k-Local and Strictly k-Piecewise languages (Heinz, 2018; Heinz and Rogers, 2010). We evaluate our learner on corpus data from Quechua and Navajo, with particular emphasis on its ability to learn nonlocal constraints. We make both theoretical and empirical contributions. We demonstrate how Strictly Local and Strictly Piecewise constraints can be encoded within our framework, and show how information-theoretic regularization can be applied to produce deterministic automata.
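To make the idea of restricting a PFA to a subregular class concrete, the following is a minimal sketch, not the authors' implementation. It assumes a PFA given by an emission matrix `E` and a transition tensor `T` (both names, along with `word_log_prob` and `sl2_transitions`, are hypothetical), scores a word with the forward algorithm, and fixes the transitions so that each state only remembers the previous symbol, which yields a Strictly 2-Local model. In a gradient-based learner, the logits behind `E` would be the trainable parameters and the negative log-probability of the training words would be the loss.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax, turning real-valued logits into probabilities."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def word_log_prob(word, E, T, init=0):
    """Forward algorithm: log-probability of `word` under a PFA.

    E[q, a]    : probability of emitting symbol a in state q
    T[q, a, r] : probability of moving to state r after emitting a in state q
    word       : list of symbol indices, ending in the end-of-word symbol
    """
    alpha = np.zeros(E.shape[0])
    alpha[init] = 1.0  # all probability mass starts in the initial state
    for a in word:
        # Emit symbol a from every state, then follow the transitions for a.
        alpha = (alpha * E[:, a]) @ T[:, a, :]
    return np.log(alpha.sum())

def sl2_transitions(n_symbols):
    """Fixed transitions for a Strictly 2-Local model (an illustrative assumption).

    State 0 is the start state; state a + 1 means "the previous symbol was a".
    Emitting symbol a always leads to state a + 1, whatever the current state.
    """
    n_states = n_symbols + 1
    T = np.zeros((n_states, n_symbols, n_states))
    symbols = np.arange(n_symbols)
    T[:, symbols, symbols + 1] = 1.0
    return T

# Hypothetical three-symbol alphabet: a = 0, b = 1, end-of-word = 2.
rng = np.random.default_rng(0)
n_symbols = 3
T = sl2_transitions(n_symbols)
E = softmax(rng.normal(size=(n_symbols + 1, n_symbols)))  # random emission logits
word = [0, 1, 2]  # the word "ab" followed by the end-of-word symbol
lp = word_log_prob(word, E, T)
```

Because the transitions here are deterministic, the score of a word reduces to a product of bigram-style emission probabilities, one per symbol, each conditioned on the state encoding the preceding symbol. An unrestricted learner would instead treat `T` as a trainable, softmax-normalized tensor.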

Objectives

Results

Discussion

Conclusion