Abstract

Case, Word Order, and Language Learnability: Insights from Connectionist Modeling

Gary Lupyan (il24@cornell.edu)
Morten H. Christiansen (mhc27@cornell.edu)
Department of Psychology, Cornell University, Ithaca, NY 14853 USA

Abstract

How does the existence of case systems and strict word order patterns affect the learnability of a given language? We present a series of connectionist simulations suggesting that both case and strict word order may facilitate syntactic acquisition by a sequential learning device. Our results are consistent with typological data concerning the frequencies with which different types of word order patterns occur across the languages of the world. Our model also accommodates patterns of syntactic development across several different languages. We conclude that non-linguistic constraints on general sequential-learning devices may help explain the relationship between case, word order, and learnability of individual languages.

Introduction

In language acquisition, children are faced with many formidable tasks, yet they normally acquire most of their native language within the first five years of life. One of the most difficult of these tasks involves mapping a sequence of words onto some sort of interpretation of what that sequence is supposed to mean. That is, in order for the child to understand a sentence, she needs to determine the grammatical roles of the individual words so that she can work out who did what to whom. Although children appear to bring powerful statistical learning mechanisms to bear on the acquisition tasks (e.g., Saffran, Aslin, & Newport, 1996), the existence of linguistic universals common across radically different languages (Greenberg, 1963) points to the presence of innate constraints on such learning.
Without such constraints, it becomes difficult to explain why there are few, if any, Object-Subject-Verb (OSV) languages (Van Everbroeck, 1999), even though in principle such a language appears to be as good as any other. In this paper, we propose that these constraints may arise from non-linguistic limitations on the sequential learning of statistical structure, and examine how this perspective may shed light on how children learn to map the words in sentences onto their appropriate grammatical roles.

There are two major ways in which languages signal syntactic relationships and grammatical roles: word order (WO) and case markings. In a strict WO language like English, declarative sentences follow a Subject-Verb-Object (SVO) pattern. It is the occurrence of the subject before the verb and the object after it that allows the hearer to comprehend who did what to whom. In contrast, languages such as Russian or Japanese allow multiple word orders and rely on case markings to disambiguate subjects from objects. For instance, Masha lubit Petyoo (SVO), Petyoo lubit Masha (OVS), and Lubit Petyoo Masha (VOS) are all grammatical in Russian and all mean Mary loves Peter (albeit with different emphases on the constituents), because of the nominative -a and accusative -u case markers.

While long-standing theories describe acquisition of language through an innate language acquisition device (e.g., Pinker, 1995), an alternative approach that is gaining ground holds that linguistic structures have adapted to the human brain rather than vice versa (e.g., Christiansen, 1994; Kirby, 1998). On this account, language universals may reflect non-linguistic cognitive constraints on learning and processing of sequential structure, rather than constraints prescribed by an innate universal grammar.
Previous work has shown that sequential-learning devices with no language-specific biases are better able to learn the more universal aspects of language than aspects encountered in rare languages (e.g., Ellefson & Christiansen, 2000; Christiansen & Devlin, 1997; Van Everbroeck, 1999, 2001).

Here, we examine the ways in which case markings and word order may function as cues for a sequential learning device acquiring syntactic structure. In Simulation 1, we model different word orders, and hypothesize that typologically common languages should be easier for a sequential-learning device to learn than rarer ones. We expand on this idea in Simulation 2 by studying the performance of networks trained on languages with varying degrees of case marking and word order flexibility. Finally, in Simulation 3, we establish that our trained networks are able to mimic the syntactic performance of children learning English, Italian, Turkish, and Serbo-Croatian (Slobin & Bever, 1982).
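The contrast between the two cue types discussed in the introduction, word order versus case marking, can be made concrete with a toy sketch. The code below is only an illustration built on the Russian example from the text; the helper names `roles_by_position` and `roles_by_case` and the suffix rules are assumptions made for this sketch, and they are not the input encoding or the networks used in the paper's simulations.

```python
# Toy illustration only: helper names and suffix rules are assumptions for
# this sketch, not the representation used in the paper's simulations.

def roles_by_position(words):
    """Strict SVO reading: first word is the subject, second the verb,
    third the object, regardless of any case endings."""
    subject, verb, obj = words
    return {"subject": subject, "verb": verb, "object": obj}


def roles_by_case(words, verb="lubit"):
    """Case-based reading using the transliterated markers from the text:
    nominative -a marks the subject; accusative -u (spelled -oo in the
    text's transliteration) marks the object."""
    roles = {}
    for word in words:
        lower = word.lower()
        if lower == verb:
            roles["verb"] = lower
        elif lower.endswith("oo"):
            roles["object"] = lower
        elif lower.endswith("a"):
            roles["subject"] = lower
    return roles


sentences = [
    "Masha lubit Petyoo",  # SVO
    "Petyoo lubit Masha",  # OVS
    "Lubit Petyoo Masha",  # VOS
]

for sentence in sentences:
    words = sentence.split()
    print(sentence)
    print("  by case:    ", roles_by_case(words))
    print("  by position:", roles_by_position(words))
```

Under these assumptions, the case-based reading assigns the same subject and object to all three word orders, whereas the purely positional reading is only correct for the SVO sentence. A learner of a flexible word order language therefore has to rely on the case cue, while a learner of a strict word order language can rely on position.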

