Abstract

This paper presents a Bayesian approach to constructing recurrent neural network language models (RNN-LMs) for speech recognition. The idea is to regularize the RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function of the Bayesian RNN (BRNN) is formed as a regularized cross-entropy error function. The regularized model is constructed not only by training the regularized parameters according to the maximum a posteriori criterion, but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix is developed by selecting a small set of salient outer products, and is shown to be effective for the BRNN-LM. The BRNN-LM yields a sparser model than the RNN-LM. Experiments on different corpora, with varying amounts of training data, show promising improvements from applying the BRNN-LM.
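As a rough illustration of the ideas in the abstract, the sketch below combines a cross-entropy error with a zero-mean Gaussian prior (equivalent to an L2 penalty with precision alpha), approximates the Hessian by a small set of salient per-example gradient outer products, and re-estimates alpha with a standard evidence-framework (marginal-likelihood) update in the style of MacKay. This is a minimal sketch under those assumptions, not the paper's implementation; all function names and the toy data are hypothetical.

```python
import numpy as np

def regularized_cross_entropy(probs, targets, w, alpha):
    """MAP objective: cross-entropy error plus a Gaussian (L2) prior term.

    probs:   (T, V) predicted next-word distributions
    targets: (T,)   indices of the observed next words
    w:       flattened model parameters
    alpha:   precision of the zero-mean Gaussian prior
    """
    ce = -np.sum(np.log(probs[np.arange(len(targets)), targets] + 1e-12))
    return ce + 0.5 * alpha * np.dot(w, w)

def outer_product_hessian(grads, k):
    """Gauss-Newton-style Hessian approximation from per-example gradients,
    keeping only the k most salient outer products (largest gradient norms)."""
    norms = np.array([np.dot(g, g) for g in grads])
    salient = np.argsort(norms)[-k:]        # indices of salient examples
    return sum(np.outer(grads[i], grads[i]) for i in salient)

def update_alpha(w, H, alpha):
    """Evidence-style re-estimation of the prior precision using the
    effective number of well-determined parameters (gamma)."""
    eigvals = np.linalg.eigvalsh(H)
    gamma = np.sum(eigvals / (eigvals + alpha))
    return gamma / (np.dot(w, w) + 1e-12)

# Toy usage: 5 time steps, vocabulary of size 4, 6 parameters.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=5)
targets = rng.integers(0, 4, size=5)
w = rng.normal(size=6)
grads = [rng.normal(size=6) for _ in range(5)]

H = outer_product_hessian(grads, k=3)
alpha = 1.0
for _ in range(3):                          # alternate evidence / MAP steps
    alpha = update_alpha(w, H, alpha)
    loss = regularized_cross_entropy(probs, targets, w, alpha)
    print(f"alpha={alpha:.4f}  loss={loss:.4f}")
```

Keeping only the largest outer products caps the cost of forming and decomposing the approximate Hessian, which is what makes the evidence update practical for a model the size of an RNN-LM.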
