Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model

Y Bengio, J.-S Senecal

doi:10.1109/tnn.2007.912312

Open Access

Journal: IEEE Transactions on Neural Networks
Publication Date: Apr 1, 2008
Citations: 250

Affiliation: Université de Montréal

#Adaptive Importance Sampling #Sequences Of Words

Abstract

Previous work on statistical language modeling has shown that it is possible to train a feedforward neural network to approximate probabilities over sequences of words, resulting in significant error reduction when compared to standard baseline models based on n-grams. However, training the neural network model with the maximum-likelihood criterion requires computations proportional to the number of words in the vocabulary. In this paper, we introduce adaptive importance sampling as a way to accelerate training of the model. The idea is to use an adaptive n-gram model to track the conditional distributions produced by the neural network. We show that a very significant speedup can be obtained on standard problems.
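
The cost the abstract refers to comes from the softmax normalization, which sums over every word in the vocabulary. The sketch below illustrates the general idea of replacing that full sum with a self-normalized importance-sampling estimate drawn from a cheaper proposal distribution. It is a minimal illustration, not the authors' algorithm: the proposal here is a fixed, hypothetical distribution standing in for the adaptive n-gram model, and all function names are assumptions made for this example.

```python
import numpy as np

def softmax(scores):
    """Exact softmax; costs O(vocabulary size)."""
    z = scores - scores.max()
    e = np.exp(z)
    return e / e.sum()

def sampled_expectation(scores, proposal, n_samples, rng):
    """Self-normalized importance-sampling estimate of softmax(scores).

    Only scores[idx] are read, so a real model would need to evaluate
    the network output for the sampled words only. `proposal` is a
    hypothetical stand-in for the adaptive n-gram model.
    """
    vocab = scores.shape[0]
    idx = rng.choice(vocab, size=n_samples, p=proposal)
    # Unnormalized importance weights exp(score) / q, computed in log space.
    log_w = scores[idx] - np.log(proposal[idx])
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    est = np.zeros(vocab)
    np.add.at(est, idx, w)  # accumulate weights of repeated samples
    return est

def score_gradient(target, expectation):
    """Gradient of log P(target | context) with respect to the scores."""
    grad = -expectation
    grad[target] += 1.0
    return grad

rng = np.random.default_rng(0)
scores = rng.normal(size=50)        # toy network outputs for 50 words
proposal = softmax(0.5 * scores)    # deliberately imperfect proposal
estimate = sampled_expectation(scores, proposal, 20000, rng)
exact = softmax(scores)
```

The closer the proposal is to the network's own conditional distribution, the fewer samples are needed for a given accuracy, which is the motivation for letting the proposal adapt during training.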

Full Text