Abstract
Recurrent neural network language models (RNNLMs) have become an increasingly popular choice for state-of-the-art speech recognition systems. RNNLMs are normally trained by minimizing the cross entropy (CE) criterion using the stochastic gradient descent (SGD) algorithm. SGD uses only first-order derivatives; no higher-order gradient information is exploited to model the correlation between parameters, so it cannot fully capture the curvature of the error cost function. This can lead to slow convergence in model training. In this paper, a limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) based second-order optimization technique is proposed for RNNLMs. This method efficiently approximates the matrix-vector product between the inverse Hessian and the gradient vector via a recursion over past gradients with a compact memory requirement. Consistent perplexity and error rate reductions are obtained over the SGD method on two speech recognition tasks: Switchboard English and Babel Cantonese. Faster convergence and a speed-up in RNNLM training time are also obtained.

Index Terms: recurrent neural network, language model, second order optimization, limited-memory BFGS, speech recognition
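The inverse-Hessian-times-gradient approximation described above is conventionally computed with the L-BFGS two-loop recursion over a short history of curvature pairs. The sketch below, in NumPy, illustrates that standard recursion only; the paper's exact variant (history length, initial Hessian scaling, line search) is not specified in the abstract, so those details here are assumptions.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Approximate H^{-1} @ grad via the standard L-BFGS two-loop recursion.

    grad   : current gradient vector
    s_list : recent parameter differences s_i = x_{i+1} - x_i (oldest first)
    y_list : recent gradient differences  y_i = g_{i+1} - g_i (oldest first)
    Returns an approximation of the inverse Hessian applied to grad;
    the update direction is the negative of the returned vector.
    """
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]

    # First loop: newest to oldest stored curvature pair.
    alphas = []
    for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):
        alpha = rho * np.dot(s, q)
        q -= alpha * y
        alphas.append(alpha)
    alphas.reverse()  # restore oldest-first order for the second loop

    # Initial Hessian approximation: a common scaling heuristic (an assumption here).
    if s_list:
        gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q

    # Second loop: oldest to newest stored curvature pair.
    for (s, y, rho), alpha in zip(zip(s_list, y_list, rhos), alphas):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s

    return r
```

Because only the last few (s, y) pairs are stored rather than an explicit Hessian, the memory cost is a small multiple of the model's parameter count, which is the compact memory requirement the abstract refers to.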