Abstract

This paper reports an initial study of unsupervised adaptation that assumes the simultaneous use of both n-gram and recurrent neural network (RNN) language models (LMs) in automatic speech recognition (ASR). Combining n-gram and RNN LMs is known to be more effective for ASR than using either alone. However, while various unsupervised adaptation methods specific to either n-gram LMs or RNN LMs have been examined, no method has been presented that simultaneously adapts both. To handle the different LMs in a unified unsupervised adaptation framework, our key idea is to introduce mixture modeling for both the n-gram LMs and the RNN LMs. Mixture modeling can handle multiple LMs simultaneously, and unsupervised adaptation is accomplished simply by adjusting the mixture weights using a recognition hypothesis of the input speech. This paper proposes joint unsupervised adaptation achieved by hybrid mixture modeling that uses both n-gram mixture models and RNN mixture models, and presents latent Dirichlet allocation based hybrid mixture modeling for effective topic adaptation. Our experiments on lecture ASR tasks show the effectiveness of joint unsupervised adaptation; we also report the performance obtained when only the n-gram LM or only the RNN LM is adapted.
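To make the weight-adjustment step concrete, the following is a minimal Python sketch of the standard EM re-estimation of mixture weights from per-word component-LM probabilities computed over a first-pass recognition hypothesis. The function name, toy scores, and iteration count are illustrative assumptions, not details taken from the paper; in a real system the probabilities would come from the n-gram and RNN component LMs scoring the hypothesis.

```python
# Hedged sketch: EM re-estimation of LM mixture weights on a hypothesis.
# Component LM probabilities are stubbed with fixed values here; in
# practice they would be P_k(w_i | h_i) from each component LM.

def adapt_mixture_weights(component_probs, num_iters=20):
    """Re-estimate mixture weights from per-word component probabilities.

    component_probs: one tuple per hypothesis word, holding the
    probability assigned to that word by each component LM.
    """
    num_lms = len(component_probs[0])
    weights = [1.0 / num_lms] * num_lms  # uniform initialization
    for _ in range(num_iters):
        # E-step: accumulate each LM's posterior responsibility per word
        counts = [0.0] * num_lms
        for probs in component_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            for k in range(num_lms):
                counts[k] += weights[k] * probs[k] / mix
        # M-step: normalize expected counts into new mixture weights
        total = sum(counts)
        weights = [c / total for c in counts]
    return weights

# Toy usage: three hypothesis words scored by an n-gram LM and an RNN LM
hyp_scores = [(0.02, 0.05), (0.10, 0.04), (0.01, 0.03)]
print(adapt_mixture_weights(hyp_scores))
```

Because only the mixture weights are re-estimated, the same update applies uniformly whether the components are n-gram LMs, RNN LMs, or both, which is what allows the hybrid mixture to be adapted in a single unsupervised pass.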
