Abstract

We derive well-understood and well-studied subregular classes of formal languages purely from the computational perspective of algorithmic learning problems. We parameterise the learning problem along dimensions of representation and inference strategy. Of special interest are those classes of languages whose learning algorithms are guaranteed not to be prohibitively expensive in space and time, since learners are often exposed to adverse conditions and sparse data. Learned natural language patterns are expected to be most like the patterns in these classes, an expectation supported by previous typological and linguistic research in phonology. A second result is that the learning algorithms presented here are completely agnostic to the choice of linguistic representation. In the case of the subregular classes, the results fall out from traditional model-theoretic treatments of words and strings. The same learning algorithms, however, can be applied to model-theoretic treatments of other linguistic representations, such as syntactic trees or autosegmental graphs, which opens a useful direction for future research.
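The following is a minimal sketch, in Python, of what such model-theoretic treatments of strings can look like: the same word encoded as a relational structure under immediate successor and under general precedence. The encoding and function names here are illustrative assumptions, not the paper's formal definitions.

```python
def successor_model(word):
    """Encode a word with the immediate-successor relation (local order only)."""
    domain = range(len(word))
    labels = {i: word[i] for i in domain}
    succ = {(i, i + 1) for i in range(len(word) - 1)}
    return labels, succ

def precedence_model(word):
    """Encode a word with general precedence: every earlier position relates
    to every later one (the transitive closure of successor)."""
    domain = range(len(word))
    labels = {i: word[i] for i in domain}
    prec = {(i, j) for i in domain for j in domain if i < j}
    return labels, prec

_, succ = successor_model("abcd")
_, prec = precedence_model("abcd")

# "The last element comes after the first" is a single atomic fact under
# precedence, but must be chained through intermediate positions under successor.
assert (0, 3) in prec
assert (0, 3) not in succ
```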

Highlights

  • This paper presents an analysis supporting the view that the computational simplicity of learning mechanisms has considerable impact on the types of patterns found in natural languages. First, we derive well-understood and well-studied subregular classes of formal languages purely from the computational perspective of algorithmic learning problems.

  • Representations and inference strategies differ in simplicity: some require substantially more space than others to properly account for the distinctions that must be made in the course of learning, and we argue that this alone would cause linguistic typology to tend toward the simpler, less space-intensive classes.

  • The paper shows how the nature of phonological typology emerges from simple representations and inference strategies.


Summary

INTRODUCTION

This paper presents an analysis supporting the view that the computational simplicity of learning mechanisms has considerable impact on the types of patterns found in natural languages. That, say, the last element in a string comes after the first is immediately accessible from the precedence model; precedence thereby collapses the notions of immediate and general structural adjacency. Building on this precedence relation, we can derive different types of relational structure. Relational structures in this way provide a uniform language for describing the structural information in representations of words, and the differences between distinct subregular classes are isolated according to the relevant structural information: the non-local information that is immediately present in the precedence model requires more work in the successor model, its transitive reduct.

These properties are encoded into the grammars being learned, and they directly carve out the properties of the classes of languages that result from a particular learning algorithm inferring such structures. The inference strategies themselves may also be bounded: despite the fact that 'aa' occurs as a substring three distinct times, Algorithm IV saturates at a count of 2 under the assumed parameters.
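As an illustration of the saturation behavior just described, the sketch below counts length-k substrings but caps each count at a threshold. The parameters (k = 2, a threshold of 2, and the input 'aaaa') are assumptions chosen to reproduce the 'aa' example; this is a minimal model of bounded counting, not the paper's Algorithm IV itself.

```python
def factors(word, k):
    """All length-k substrings (contiguous factors), with multiplicity."""
    return [word[i:i + k] for i in range(len(word) - k + 1)]

def saturating_counts(word, k, threshold=2):
    """Count each k-factor, but never beyond the threshold."""
    counts = {}
    for f in factors(word, k):
        counts[f] = min(counts.get(f, 0) + 1, threshold)
    return counts

# 'aa' occurs as a substring of 'aaaa' three distinct times, but a learner
# that counts only up to 2 cannot distinguish three occurrences from two.
print(saturating_counts("aaaa", k=2))  # -> {'aa': 2}
```

Because the counter saturates, the resulting grammar makes strictly fewer distinctions than an unbounded counter would, which keeps the learner's space requirements constant per factor.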

