Abstract

We propose a segmental neural language model that combines the generalization power of neural networks with the ability to discover word-like units that are latent in unsegmented character sequences. In contrast to previous segmentation models that treat word segmentation as an isolated task, our model unifies word discovery, learning how words fit together to form sentences, and, by conditioning the model on visual context, how words' meanings ground in representations of non-linguistic modalities. Experiments show that the unconditional model learns predictive distributions better than character LSTM models, discovers words competitively with nonparametric Bayesian word segmentation models, and that modeling language conditional on visual context improves performance on both.

Highlights

  • How infants discover words that make up their first language is a long-standing question in developmental psychology (Saffran et al., 1996)

  • The best language models, measured in terms of their ability to predict language, segment quite poorly (Chung et al., 2017; Wang et al., 2017; Kádár et al., 2018), while the strongest models in terms of word segmentation (Goldwater et al., 2009; Berg-Kirkpatrick et al., 2010) do not adequately account for the long-range dependencies that are manifest in language and that are captured by recurrent neural networks (Mikolov et al., 2010)

  • We argue that such a unified model is preferable to a pipeline model of language acquisition

Summary

Introduction

How infants discover words that make up their first language is a long-standing question in developmental psychology (Saffran et al., 1996). We introduce a single model that discovers words, learns how they fit together (not just locally, but across a complete sentence), and grounds them in learned representations of naturalistic non-linguistic visual contexts. We argue that such a unified model is preferable to a pipeline model of language acquisition (e.g., a model where words are learned by one character-aware model, and a full-sentence grammar is acquired by a second language model using the words predicted by the first). Our model makes a conditional independence assumption: each segment is generated independently of how the preceding context was segmented, given an LSTM encoding of the full character history. This assumption has been employed in a number of related models to permit the use of LSTMs to represent rich history while retaining the convenience of dynamic programming inference algorithms (Wang et al., 2017; Ling et al., 2017; Graves, 2012).
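
A short sketch may help make the role of this assumption concrete. The Python below is our illustration, not the authors' released code: it shows the forward dynamic program that the assumption licenses, summing over all segmentations of a character sequence. Here log_seg_prob is a hypothetical stand-in for the neural segment scorer, which in the SNLM would condition on an LSTM encoding of the preceding characters.

    import math

    def log_marginal(chars, log_seg_prob, max_seg_len=10):
        """Return log p(chars), marginalizing over all segmentations.

        log_seg_prob(i, j) is assumed to give log p(chars[i:j] | chars[:i]);
        this signature is a placeholder for the neural segment model.
        """
        n = len(chars)
        # alpha[t] = log-probability of generating chars[:t] under any segmentation
        alpha = [-math.inf] * (n + 1)
        alpha[0] = 0.0
        for t in range(1, n + 1):
            # the final segment ends at t and starts at some j within max_seg_len
            for j in range(max(0, t - max_seg_len), t):
                alpha[t] = logaddexp(alpha[t], alpha[j] + log_seg_prob(j, t))
        return alpha[n]

    def logaddexp(a, b):
        # numerically stable log(exp(a) + exp(b))
        if a == -math.inf:
            return b
        if b == -math.inf:
            return a
        m = max(a, b)
        return m + math.log(math.exp(a - m) + math.exp(b - m))

    # Toy usage with a uniform stand-in scorer over a 27-symbol alphabet:
    print(log_marginal("adoggy", lambda i, j: (j - i) * math.log(1 / 27), max_seg_len=4))

Because each alpha[t] reuses the values alpha[j] computed for shorter prefixes, the sum over exponentially many segmentations costs only O(n · max_seg_len) calls to the segment scorer.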

Contents

  • Segment generation
  • Inference
  • Expected length regularization
  • Datasets: English, Chinese, Image Caption Dataset
  • Experiments
  • Results
  • Related Work
  • Appendix A: Dataset statistics
  • Appendix C: SNLM Model Configuration
  • Appendix E: Learning
  • Appendix F: Evaluation Metrics