rnnDrop: A Novel Dropout for RNNs in ASR

Taesup Moon, Hoshik Lee, Heeyoul Choi, Inchul Song

doi:10.1109/asru.2015.7404775

Abstract

Recently, recurrent neural networks (RNNs) have achieved state-of-the-art performance in several applications that deal with temporal data, e.g., speech recognition, handwriting recognition, and machine translation. While the ability to handle long-term dependencies in data is key to the success of RNNs, combating over-fitting during training is critical for achieving cutting-edge performance, particularly as the depth and size of the network increase. To that end, there have been some attempts to apply dropout, a popular regularization scheme for feed-forward neural networks, to RNNs, but they do not perform as well as other regularization schemes such as weight noise injection. In this paper, we propose rnnDrop, a novel variant of dropout tailored for RNNs. Unlike existing methods, in which dropout is applied only to the non-recurrent connections, the proposed method also applies dropout to the recurrent connections, in such a way that RNNs generalize well. Our experiments show that rnnDrop is a better regularization method than the others, including weight noise injection. Namely, when deep bidirectional long short-term memory (LSTM) RNNs were trained with rnnDrop as acoustic models for phoneme and speech recognition, they significantly outperformed the current state of the art: we achieved a phoneme error rate of 16.29% on the TIMIT core test set for phoneme recognition and a word error rate of 5.53% on the Wall Street Journal (WSJ) dataset, dev93, for speech recognition, which are the best reported results on both datasets.
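To make the idea of dropout on recurrent connections concrete, the following is a minimal NumPy sketch, not the implementation evaluated in this paper. It rests on several assumptions that the abstract does not state: the network is simplified to a plain tanh RNN rather than a deep bidirectional LSTM, a single binary mask is sampled once per sequence and reused at every time step, and test-time rescaling is omitted. The names `rnn_forward`, `W_in`, `W_rec`, and `drop_prob` are hypothetical.

```python
import numpy as np

def rnn_forward(xs, W_in, W_rec, drop_prob=0.0, rng=None):
    """Run a tanh RNN over xs (shape [T, input_dim]).

    Returns the hidden states (shape [T, hidden_dim]) and the mask used.
    Assumption: one mask per sequence, shared by all time steps.
    """
    hidden_dim = W_rec.shape[0]
    if drop_prob > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        # Each hidden unit is kept with probability 1 - drop_prob.
        mask = (rng.random(hidden_dim) >= drop_prob).astype(float)
    else:
        mask = np.ones(hidden_dim)

    h = np.zeros(hidden_dim)
    states = []
    for x in xs:
        # The masked state feeds the next step through W_rec, so the
        # dropped units are removed from the recurrent path as well.
        h = mask * np.tanh(W_in @ x + W_rec @ h)
        states.append(h)
    return np.stack(states), mask

# Toy usage with random weights (illustrative values only).
rng = np.random.default_rng(0)
T, input_dim, hidden_dim = 5, 3, 4
xs = rng.standard_normal((T, input_dim))
W_in = 0.1 * rng.standard_normal((hidden_dim, input_dim))
W_rec = 0.1 * rng.standard_normal((hidden_dim, hidden_dim))
states, mask = rnn_forward(xs, W_in, W_rec, drop_prob=0.5, rng=rng)
```

In this sketch a unit dropped by the mask stays at zero for the entire sequence, whereas dropout restricted to non-recurrent connections would leave the path from one hidden state to the next untouched.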

Full Text