Abstract
The recurrent neural network language model (RNNLM) has shown significant promise for statistical language modeling. In this work, a new class-based output-layer method is introduced to further improve the RNNLM: word-class information is incorporated into the output layer by using the Brown clustering algorithm to estimate a class-based language model. Experimental results show that the new output layer with word clustering not only clearly improves convergence but also reduces perplexity and word error rate in large-vocabulary continuous speech recognition.
Highlights
Statistical language models estimate the probability of a word occurring in a given context and play an important role in many natural language processing applications such as speech recognition, machine translation, and information retrieval.
The first two columns refer to the baseline RNNLM with frequency-based word clustering (RNNLM-Freq) and to that model interpolated with a standard Kneser-Ney 5-gram back-off language model (RNNLM-Freq + KN5), respectively, which is consistent with the results reported in [4].
We can see that the word error rate (WER) on the evaluation set is consistently reduced by 0.3% to 0.7% absolute compared with RNNLM-Freq, and by 1.4% to 1.7% compared with the 1-best hypothesis.
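The interpolation with KN5 mentioned above is, in its usual formulation, a per-word linear mixture of the two models' probabilities. A minimal sketch, assuming the standard formulation; the interpolation weight `lam` is a hypothetical value, not one reported in this paper:

```python
def interpolate(p_rnnlm: float, p_kn5: float, lam: float = 0.5) -> float:
    """Linearly interpolate RNNLM and KN5 word probabilities.

    lam is the RNNLM weight (hypothetical here); in practice it is
    tuned on held-out data to minimize perplexity.
    """
    return lam * p_rnnlm + (1.0 - lam) * p_kn5

# Example: mixing two word probabilities with equal weight
p = interpolate(0.02, 0.01)   # 0.5*0.02 + 0.5*0.01 = 0.015
```

Because both inputs are proper probabilities over the same vocabulary, the mixture remains a proper probability distribution for any weight in [0, 1].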
Summary
Statistical language models estimate the probability of a word occurring in a given context and play an important role in many natural language processing applications such as speech recognition, machine translation, and information retrieval. It is reasonable to assume that similar words occur in the same context with similar probability; for example, 'America,' 'China,' and 'Japan' usually come after the same prepositions or serve as the subject of a sentence. Based on this assumption, neural network language models (NNLMs) [1,2] project discrete words into a continuous space. Other tree-structured output-layer methods have been proposed to speed up NNLMs [7,8]. We introduce a new method for constructing a class-based output layer using the Brown clustering algorithm. In the frequency-based baseline, words are roughly clustered according to their frequencies, which increases training speed but degrades performance. This paper is organized as follows: in Section 2, we introduce our baseline RNNLM and the proposed Brown clustering method for constructing the output layer.
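The class-based output layer described above factorizes the word probability as P(w | h) = P(class(w) | h) · P(w | class(w), h), so for each prediction only a small class softmax and a within-class softmax are evaluated instead of a full-vocabulary softmax. A minimal NumPy sketch under assumed toy dimensions; the class assignment here is an illustrative stand-in for the Brown clusters, and all variable names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

V, C, H = 10, 3, 4                  # toy vocab size, class count, hidden size
word2class = np.arange(V) % C       # stand-in for a Brown-cluster assignment

# Output-layer weights: one matrix for classes, one for words
W_class = rng.standard_normal((C, H))
W_word = rng.standard_normal((V, H))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def word_prob(h, w):
    """P(w | h) = P(class(w) | h) * P(w | class(w), h)."""
    c = word2class[w]
    p_class = softmax(W_class @ h)[c]
    members = np.flatnonzero(word2class == c)  # only score words in w's class
    scores = W_word[members] @ h
    p_word = softmax(scores)[np.where(members == w)[0][0]]
    return p_class * p_word

h = rng.standard_normal(H)          # a hidden state from the recurrent layer
probs = np.array([word_prob(h, w) for w in range(V)])
```

Summing the within-class distribution of each class to 1 and weighting by the class distribution means the factorized probabilities over the whole vocabulary still sum to 1; the saving comes from scoring roughly C + V/C outputs per word rather than V.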
EURASIP Journal on Audio, Speech, and Music Processing