Bayesian Learning at the Syntax-Semantics Interface

Sourabh Niyogi (niyogi@mit.edu)
Massachusetts Institute of Technology, Cambridge, MA USA

Abstract

Given a small number of examples of scene-utterance pairs of a novel verb, language learners can learn its syntactic and semantic features. The syntactic and semantic bootstrapping hypotheses both rely on cross-situational observation to resolve the ambiguity present in any single observation. In this paper, we cast the distributional evidence from scenes and syntax in a unified Bayesian probabilistic framework. Unlike previous approaches to modeling lexical acquisition, our framework uniquely: (1) models learning from only a small number of scene-utterance pairs; (2) utilizes and integrates both syntactic and semantic evidence, thus reconciling the apparent tension between the syntactic and semantic bootstrapping approaches; (3) robustly handles noise; and (4) makes the distinction between prior and acquired knowledge explicit, through specification of the hypothesis space and the prior and likelihood probability distributions.

Learning Word Syntax and Semantics

Given a small number of examples of scene-utterance pairs of a novel word, a child can determine both the range of syntactic constructions the novel word can appear in and inductively generalize to other scene instances likely to be covered by the concept represented (Pinker 1989). The inherent semantic, syntactic, and referential uncertainty in a single scene-utterance pair is well established (cf. Siskind 1996). In contrast, with multiple scene-utterance pairs, language learners can reduce the uncertainty of which semantic and syntactic features are associated with a novel word. Verbs exemplify the core problems of scene-utterance referential uncertainty: verbs selectively participate in different alternation patterns, which are cues to their inherent semantic and syntactic features (Levin 1993).
How are these features of words acquired, given only positive evidence of scene-utterance pairs? The syntactic bootstrapping hypothesis (Gleitman 1990) is that learners exploit the distribution of "syntactic frames" to constrain the possible semantic features of verbs. If a learner hears /glip/ in frames of the form /S glipped G with F/ and rarely hears /S glipped F into G/, the learner can infer with high confidence that /glip/ is in the same verb class as /fill/ and has the same sort of argument structure. A different distribution informs the learner of a different verb class. Considerable evidence has mounted in support of this hypothesis (cf. Naigles 1990, Fisher et al 1994). In contrast, the semantic bootstrapping hypothesis (Pinker 1989) is that learners use what is common across scenes to constrain the possible word argument structures. If a learner sees a liquid undergoing a location change when /S glipped F/ is uttered, then /glip/ is likely to be in the same verb class as /pour/ and have the same sort of meaning.

Both hypotheses require the distribution of cross-situational observations. Prior accounts of word learning have either ignored the essential role of syntax (Siskind 1996, Tenenbaum and Xu 2000) or required thousands of training observations to enable learning (Regier et al 2001). In this paper we present a Bayesian model of learning the syntax and semantics of verbs that overcomes these barriers, by demonstrating how word-concept mappings can be acquired from very little evidence, where the evidence is information from both scenes and syntax.

Bayesian Learning of Features

We illustrate our approach with a Bayesian analysis of a single feature. On some accounts, verbs possess a cause feature which may be valued 1, *, or 0 (Harley and Noyer 2000); depending on the value of the cause feature, the verb may appear in frame F1, frame F0, or both:

1 Externally caused (e.g., touch, load):
   F1: He touched the glass.
   F0: *The glass touched.
* Externally causable (e.g., break, fill):
   F1: He broke the glass.
   F0: The glass broke.
0 Internally caused (e.g., laugh, glow):
   F1: *He laughed the children.
   F0: The children laughed.

Assuming this analysis, learners who hear utterances containing a novel verb, not knowing the value of its cause feature, must choose between 3 distinct hypotheses H_1, H_*, and H_0. Clearly, one utterance cannot uniquely determine the value of the feature: if learners hear F1 (/S Ved O/), the feature sup-
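The single-feature analysis above can be sketched as a small Bayesian update over the three hypotheses. This is an illustrative sketch, not the paper's actual model: the uniform prior, the 0.05 noise rate, and the 0.5/0.5 frame split under H_* are all assumed values chosen for demonstration.

```python
# Hedged sketch of Bayesian inference over a verb's cause feature.
# Hypotheses: H1 (cause=1, frame F1 only), H* (cause=*, both frames),
# H0 (cause=0, frame F0 only). A small noise probability EPS keeps a
# single stray frame from zeroing out a hypothesis; the exact noise
# model and priors here are illustrative assumptions.

EPS = 0.05  # assumed noise rate

# P(frame | hypothesis): which frames each hypothesis predicts
likelihood = {
    "H1": {"F1": 1 - EPS, "F0": EPS},
    "H*": {"F1": 0.5,     "F0": 0.5},
    "H0": {"F1": EPS,     "F0": 1 - EPS},
}

def posterior(frames, prior=None):
    """Return P(H | observed frames) by iterated Bayesian updating."""
    hyps = list(likelihood)
    p = dict(prior) if prior else {h: 1.0 / len(hyps) for h in hyps}
    for f in frames:
        for h in hyps:
            p[h] *= likelihood[h][f]
    z = sum(p.values())
    return {h: p[h] / z for h in p}

# One F1 utterance leaves both H1 and H* plausible (the single-observation
# ambiguity); repeated F1 with no F0 increasingly favors H1.
print(posterior(["F1"]))
print(posterior(["F1"] * 5))
```

A single /S Ved O/ observation shifts belief toward H_1 but cannot rule out H_*; only the cross-situational absence of F0 frames does that, which is the point of the passage above.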
