What Homophones Say about Words.

Isabelle Dautriche, Emmanuel Chemla

doi:10.1371/journal.pone.0162176

Abstract

The number of potential meanings for a new word is astronomical. To make the word-learning problem tractable, one must restrict the hypothesis space. To do so, current word learning accounts often incorporate constraints about cognition or about the mature lexicon directly in the learning device. We are concerned with the convexity constraint, which holds that concepts (privileged sets of entities that we think of as “coherent”) do not have gaps (if A and B belong to a concept, so does any entity “between” A and B). To derive a linguistic constraint from it, learning algorithms have extended this constraint from concepts to word forms: some algorithms rely on the possibility that word forms are associated with convex sets of objects. Yet this does not have to be the case: homophones are word forms associated with two separate words and meanings. Two sets of experiments show that when evidence suggests that a novel label is associated with a disjoint (non-convex) set of objects, either a) because there is a gap in conceptual space between the learning exemplars for a given word or b) because of the intervention of other lexical items in that gap, adults prefer to postulate homophony, where a single word form is associated with two separate words and meanings, rather than inferring that the word could have a disjunctive, discontinuous meaning. These results about homophony must be integrated into current word learning algorithms. We conclude by arguing for a weaker specialization of word learning algorithms, which too often could miss important constraints by focusing on a restricted empirical basis (e.g., non-homophonous content words).
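The convexity constraint can be made concrete with a small sketch. It assumes a toy one-dimensional conceptual space, given as a list ordered by similarity; the items, the ordering, and the function names `is_convex` and `convex_components` are illustrative assumptions, not material from the experiments. A label's extension is convex if it covers an unbroken run of that ordering. A gapped extension can instead be read as several convex pieces, which is the homophony reading.

```python
def is_convex(extension, space):
    """Return True if `extension` has no gaps along the ordered `space`."""
    positions = sorted(space.index(item) for item in set(extension))
    if not positions:
        return True
    # Convex means: every position between the first and last is included.
    return positions == list(range(positions[0], positions[-1] + 1))


def convex_components(extension, space):
    """Split `extension` into maximal gap-free runs (one run per candidate word)."""
    positions = sorted(space.index(item) for item in set(extension))
    runs = []
    for pos in positions:
        if runs and pos == runs[-1][-1] + 1:
            runs[-1].append(pos)   # extends the current run
        else:
            runs.append([pos])     # a gap: start a new run
    return [[space[p] for p in run] for run in runs]


# Hypothetical conceptual space, ordered by similarity (an assumption).
space = ["kitten", "cat", "lynx", "tiger", "lion"]

convex_label = {"cat", "lynx", "tiger"}
gapped_label = {"kitten", "cat", "lion"}   # skips "lynx" and "tiger"

print(is_convex(convex_label, space))
print(is_convex(gapped_label, space))
print(convex_components(gapped_label, space))
```

Under these assumptions, a learner that requires convex word forms would have to reject `gapped_label`, whereas a learner that allows homophony can keep it as two separate convex meanings sharing one form.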

Highlights

Learning the word “cat” implies associating the sequence of sounds /kæt/ with the set of all cats and only cats
Two sets of experiments show that when evidence suggests that a novel label is associated with a disjoint set of objects, either a) because there is a gap in conceptual space between the learning exemplars for a given word or b) because of the intervention of other lexical items in that gap, adults prefer to postulate homophony, where a single word form is associated with two separate words and meanings, rather than inferring that the word could have a disjunctive, discontinuous meaning
We showed an effect of the distribution of the learning exemplars in conceptual space: observing exemplars clustered at two distant positions in the hypothesis space boosted the likelihood that the exemplars were sampled from two independent categories
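The effect of exemplar distribution can be illustrated with a toy calculation. This is only a sketch under strong assumptions, not the authors' model: exemplars are points on a line, each category is the tightest interval covering its exemplars, and exemplars are sampled uniformly from their category, so tighter intervals make the data more likely. The names `log_likelihood` and `two_category_advantage`, the `min_width` floor, and the numbers are hypothetical.

```python
import math


def log_likelihood(exemplars, min_width=1.0):
    """Log-likelihood of exemplars sampled uniformly from the tightest
    interval that covers them (width floored at `min_width`)."""
    width = max(max(exemplars) - min(exemplars), min_width)
    return -len(exemplars) * math.log(width)


def two_category_advantage(exemplars, min_width=1.0):
    """Log-likelihood gain of the best two-interval split over one interval.

    Requires at least two exemplars. A two-interval account always fits at
    least as well, so only the size of the gain is informative here.
    """
    xs = sorted(exemplars)
    one = log_likelihood(xs, min_width)
    best_two = max(
        log_likelihood(xs[:i], min_width) + log_likelihood(xs[i:], min_width)
        for i in range(1, len(xs))
    )
    return best_two - one


# Same range and same number of exemplars, different distributions.
clustered = [0, 1, 2, 18, 19, 20]    # two distant clusters
spread = [0, 4, 8, 12, 16, 20]       # evenly spaced

print(two_category_advantage(clustered))
print(two_category_advantage(spread))
```

In this toy setting the gain from positing two categories is larger for the clustered exemplars than for the evenly spread ones, which mirrors the direction of the reported effect. A fuller account would also need a prior that penalizes the extra category.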

Summary

Introduction

Learning the word “cat” implies associating the sequence of sounds /kæt/ with the set of all cats and only cats. Quite generally, one description of the meaning of a content word is its “extension”, i.e. the set of all entities to which that word refers (an idea discussed in detail in the tradition of formal semantics at least since [1]). Language learners need to infer the extension of a word from a set of exemplars that surely do not exhaust that extension. The underlying inference problem would be unsolvable without prior knowledge, most notably knowledge that constrains the hypothesis space, i.e. the set of potential meanings for words (e.g., [2]–[5]; and [6] for a formal proof). One way in which learners may reduce their hypothesis space is by privileging some meanings over others. Toddlers and preschoolers prefer to extend a novel word

Results

Discussion

Conclusion