Abstract

Experimental research has uncovered language learners’ ability to frequency-match to statistical generalizations across the lexicon, while also acquiring the idiosyncratic behavior of individual attested words. How can we model the learning of a frequency-matching grammar together with lexical idiosyncrasy? A recent approach based in the single-level regression model Maximum Entropy Harmonic Grammar makes use of general constraints that putatively capture statistical generalizations across the lexicon, as well as lexical constraints governing the behavior of individual words. I argue on the basis of learning simulations that the approach fails to learn statistical generalizations across the lexicon, running into what I call the GRAMMAR-LEXICON BALANCING PROBLEM: lexical constraints are so powerful that the learner comes to acquire the behavior of each attested form using only these constraints, at which point the general constraint is rendered superfluous and ineffective. I argue that MaxEnt be replaced with the HIERARCHICAL REGRESSION MODEL: multiple layers of regression structure, corresponding to different levels of a hierarchy of generalizations. Hierarchical regression is shown to surmount the grammar-lexicon balancing problem—learning a frequency-matching grammar together with lexical idiosyncrasy—by encoding general constraints as fixed effects and lexical constraints as a random effect. The model is applied to variable Slovenian palatalization, with promising results.

Highlights

  • Language learners can acquire principles that have differing levels of generalization

  • A recent approach based in the single-level regression model Maximum Entropy Harmonic Grammar (MaxEnt) combines general constraints, which putatively capture statistical generalizations across the lexicon, with lexical constraints governing the behavior of individual words

  • Lexical constraints prove too powerful: they come to encode each attested word’s behavior, at which point frequency matching to the overall rate ceases and the general constraint becomes ineffective for modeling the grammar


Introduction

Language learners can acquire principles at differing levels of generalization. Experimental research has uncovered learners’ ability to frequency-match to statistical generalizations across the lexicon while also acquiring the idiosyncratic behavior of individual attested words. For example, in a wug test (Berko 1958), investigators presented Hungarian native speakers with nonce stems of the relevant shape and asked whether they would inflect them with -nɛk or -nɔk. Their responses, in aggregate, matched the corpus frequencies very closely. A recent approach based in the single-level regression model Maximum Entropy Harmonic Grammar (MaxEnt) makes use of general constraints that putatively capture statistical generalizations across the lexicon, together with lexical constraints governing the behavior of individual words. The lexical constraints, however, prove too powerful: they come to encode each attested word’s behavior, at which point frequency matching to the overall rate ceases and the general constraint becomes ineffective for modeling the grammar. The task at hand is to model the learning of a system with two such levels of generalization.
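The contrast between the two architectures can be sketched with a toy simulation (all numbers and the penalized-likelihood setup below are illustrative assumptions, not the paper’s actual simulations). A general weight is shared across words and each attested word gets its own lexical weight; an optional Gaussian penalty on the lexical weights stands in for the prior that a random effect imposes in a hierarchical model. With no penalty (the single-level, MaxEnt-style fit), the lexical weights absorb every word’s rate exactly, so the general weight is redundant and a novel (wug) word, which carries no lexical weight, need not receive the lexicon-wide rate. With the penalty, shrinkage forces the general weight to carry that rate.

```python
import numpy as np

# Hypothetical data: 20 attested "words", each varying between two
# outputs, with per-word rates clustered around a lexicon-wide ~0.7.
rng = np.random.default_rng(0)
n_words, n_obs = 20, 50
word_rates = np.clip(rng.normal(0.7, 0.15, n_words), 0.05, 0.95)
counts = rng.binomial(n_obs, word_rates)   # output-A tallies per word

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def fit(l2_lex, steps=5000, lr=0.1):
    """Fit P(A | word i) = sigmoid(w_gen + w_lex[i]) by gradient ascent
    on the (penalized) log-likelihood. l2_lex = 0 mimics plain MaxEnt
    with free lexical constraints; l2_lex > 0 plays the role of the
    Gaussian prior that a random effect places on lexical weights."""
    w_gen, w_lex = 0.0, np.zeros(n_words)
    for _ in range(steps):
        resid = counts / n_obs - sigmoid(w_gen + w_lex)  # rate residuals
        w_gen += lr * resid.mean()
        w_lex += lr * (resid - l2_lex * w_lex)
    return w_gen, w_lex

g_flat, lex_flat = fit(l2_lex=0.0)   # single-level, MaxEnt-style fit
g_hier, lex_hier = fit(l2_lex=1.0)   # hierarchical-style, penalized fit

overall = counts.sum() / (n_words * n_obs)
# Flat fit: lexical weights match each word's rate exactly, leaving the
# general weight redundant; a wug word would receive sigmoid(g_flat),
# which need not match the lexicon-wide rate.
print(round(sigmoid(g_flat), 2), "vs. overall rate", round(overall, 2))
# Penalized fit: lexical weights are shrunk toward zero, and the general
# weight carries the lexicon-wide rate, so a wug word frequency-matches.
print(round(sigmoid(g_hier), 2))
```

The penalty here is a fixed stand-in for a random effect’s variance; in a full hierarchical model that variance would itself be estimated from the spread of the lexical weights.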

The MaxEnt approach leads to the grammar-lexicon balancing problem
The hierarchical solution to grammar and lexicon
Findings
Conclusion