Abstract

This work explores the application of recurrent neural network (RNN) language and translation models during phrase-based decoding. Because they use unbounded context, integrating RNNs into the decoder is more challenging than integrating feedforward neural models. In this paper, we apply approximations and use caching to enable RNN decoder integration while requiring reasonable memory and time resources. We analyze the effect of caching on translation quality and speed, and use it to integrate RNN language and translation models into a phrase-based decoder. To the best of our knowledge, no previous work has discussed the integration of RNN translation models into phrase-based decoding. We also show that a special RNN can be integrated efficiently without the need for approximations. We compare decoding using RNNs to rescoring n-best lists on two tasks: IWSLT 2013 German→English and BOLT Arabic→English. We demonstrate that the performance of decoding with RNNs is at least as good as using them in rescoring.
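As a rough illustration of the caching idea, the sketch below memoizes RNN state extensions during phrase-based hypothesis expansion, so that identical (state, word) continuations are scored only once. This is a minimal sketch under stated assumptions, not the authors' implementation; the names CachedRNNScorer, initial_state, and step are hypothetical placeholders for an actual recurrent language model interface.

```python
# Minimal sketch (not the paper's implementation): cache RNN state
# extensions so repeated (state, word) continuations across partial
# hypotheses are computed only once. `rnn_lm` is assumed to expose
# initial_state() and step(state, word) -> (new_state, logprob).

class CachedRNNScorer:
    """Wraps an RNN LM so repeated extensions reuse cached states."""

    def __init__(self, rnn_lm):
        self.rnn_lm = rnn_lm
        self.cache = {}                          # (state_id, word) -> (new_state_id, logprob)
        self.states = [rnn_lm.initial_state()]   # state_id -> RNN hidden state

    def extend(self, state_id, word):
        """Score one word continuation, reusing the cache when possible."""
        key = (state_id, word)
        if key not in self.cache:
            new_state, logprob = self.rnn_lm.step(self.states[state_id], word)
            self.states.append(new_state)
            self.cache[key] = (len(self.states) - 1, logprob)
        return self.cache[key]

    def score_phrase(self, state_id, words):
        """Extend a partial hypothesis by a target phrase, word by word."""
        total = 0.0
        for w in words:
            state_id, logprob = self.extend(state_id, w)
            total += logprob
        return state_id, total
```

In a full decoder, the stored states would additionally be subject to the approximations discussed in the paper (e.g., hypothesis recombination over truncated context), which keeps the cache within reasonable memory.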

Highlights

  • Applying neural networks to statistical machine translation has been gaining increasing attention recently

  • These improvements are at least as good as those of rescoring. This holds both for the exact bidirectional translation model (BTM) and for the approximate language model (LM) and joint model (JM) cases

  • The results show that recurrent neural network (RNN) LM rescoring can be improved upon when the RNN LM is included directly in decoding


Summary

Introduction

Applying neural networks to statistical machine translation has been gaining increasing attention recently. Neural networks have been used for standalone decoding with a simple beam-search word-based decoder (Sutskever et al., 2014; Bahdanau et al., 2015). Another approach is to apply neural models directly within a phrase-based decoder. We focus on this approach, which is challenging because phrase-based decoding typically involves generating tens or even hundreds of millions of partial hypotheses. Scoring this many hypotheses with neural models is expensive, mainly due to the usually large output layer. Decoder integration was done in (Vaswani et al., 2013) for feedforward neural language models. Devlin et al. (2014) integrate feedforward translation models into phrase-based decoding and report major improvements, which highlight the strength of the underlying models.
