Abstract

This paper investigates the impact of word classing on the recurrent neural network language model (RNNLM), which has recently been shown to outperform many competitive language modeling techniques. In particular, the class-based RNNLM (CRNNLM) was proposed in prior work to speed up both the training and testing phases of the RNNLM. However, in past work, word classes for the CRNNLM were obtained simply from word frequencies, which is a crude criterion. Hence, we take a closer look at word classing to find out whether improved classing translates into improved performance. More specifically, we explore the Brown algorithm, a classical word-classing method. In experiments on a standard test set, we find that the Brown algorithm yields a 5% to 7% relative reduction in perplexity (PPL) compared to the frequency-based word-classing method.
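As background for the speedup claim: the class-based RNNLM factorizes the word probability as $P(w_t \mid h_t) = P(c(w_t) \mid h_t)\, P(w_t \mid c(w_t), h_t)$, reducing the output-layer cost from $O(|V|)$ to roughly $O(|C| + |V|/|C|)$. The sketch below illustrates one common variant of frequency-based classing of the kind critiqued here, in which words are binned so that each class covers an approximately equal share of the unigram mass; this binning scheme is an assumption for illustration and may differ from the exact method used in the experiments.

```python
from collections import Counter

def frequency_based_classes(tokens, num_classes=100):
    """Assign words to classes by unigram frequency.

    Assumed variant for illustration: sort words by frequency, then
    split them into classes so each class covers roughly an equal
    share of the total probability mass.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    word_class = {}
    cls, mass = 0, 0.0
    # Most frequent words first, so high-frequency words end up
    # sharing small classes and rare words fill out large ones.
    for word, count in counts.most_common():
        word_class[word] = cls
        mass += count / total
        if mass > (cls + 1) / num_classes and cls < num_classes - 1:
            cls += 1
    return word_class

# Usage: classes = frequency_based_classes(corpus_tokens, num_classes=100)
```

Because such a partition ignores syntactic and semantic similarity, the paper's premise is that a distributional method such as the Brown algorithm, which merges classes to maximize the mutual information between adjacent class bigrams, should produce classes better suited to the CRNNLM factorization.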
