Abstract

We propose multi-way, multilingual neural machine translation. The proposed approach enables a single neural translation model to translate between multiple languages, with a number of parameters that grows only linearly with the number of languages. This is made possible by having a single attention mechanism that is shared across all language pairs. We train the proposed multi-way, multilingual model on ten language pairs from WMT'15 simultaneously and observe clear performance improvements over models trained on only one language pair. In particular, we observe that the proposed model significantly improves the translation quality of low-resource language pairs.
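
As a rough illustration of the parameter argument, here is a minimal, hypothetical PyTorch sketch (not the authors' implementation; all class names, dimensions, and the `step` helper are invented for illustration): each language contributes its own encoder, decoder, embedding table, and readout, while a single attention module is shared by every language pair, so adding a language adds only a fixed number of parameters.

```python
import torch
import torch.nn as nn

class SharedAttention(nn.Module):
    """The single attention module shared across all language pairs."""
    def __init__(self, enc_dim, dec_dim, att_dim):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, att_dim)
        self.dec_proj = nn.Linear(dec_dim, att_dim)
        self.score = nn.Linear(att_dim, 1)

    def forward(self, enc_states, dec_state):
        # enc_states: (src_len, enc_dim); dec_state: (dec_dim,)
        scores = self.score(torch.tanh(self.enc_proj(enc_states) + self.dec_proj(dec_state)))
        alpha = torch.softmax(scores.squeeze(-1), dim=0)       # attention weights
        return (alpha.unsqueeze(-1) * enc_states).sum(dim=0)   # context vector

class MultiWayNMT(nn.Module):
    """One encoder/decoder per language; only the attention is shared,
    so the parameter count grows linearly with the number of languages."""
    def __init__(self, langs, vocab_size, emb=256, hid=512, att=256):
        super().__init__()
        self.embs = nn.ModuleDict({l: nn.Embedding(vocab_size, emb) for l in langs})
        self.encoders = nn.ModuleDict({l: nn.GRU(emb, hid) for l in langs})
        self.decoders = nn.ModuleDict({l: nn.GRUCell(emb + hid, hid) for l in langs})
        self.readout = nn.ModuleDict({l: nn.Linear(hid, vocab_size) for l in langs})
        self.attention = SharedAttention(hid, hid, att)  # shared by all pairs

    def step(self, src_lang, tgt_lang, src_ids, prev_tgt_id, dec_state):
        """One decoder step translating src_lang -> tgt_lang (batch size 1).
        Re-encoding the source each step is wasteful; kept for brevity."""
        enc_out, _ = self.encoders[src_lang](self.embs[src_lang](src_ids).unsqueeze(1))
        ctx = self.attention(enc_out.squeeze(1), dec_state)  # same module for every pair
        inp = torch.cat([self.embs[tgt_lang](prev_tgt_id), ctx]).unsqueeze(0)
        dec_state = self.decoders[tgt_lang](inp, dec_state.unsqueeze(0)).squeeze(0)
        return self.readout[tgt_lang](dec_state), dec_state

# Each extra language adds one encoder, decoder, embedding table and readout,
# but no new attention, so total parameters grow linearly in len(langs).
for n in (2, 4, 8):
    model = MultiWayNMT([f"lang{i}" for i in range(n)], vocab_size=10_000)
    print(n, sum(p.numel() for p in model.parameters()))
```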

Highlights

  • Neural machine translation: it has been shown that a deep neural network can successfully learn a complex mapping between variable-length input and output sequences on its own

  • One recurrent neural network, called the encoder, compresses the source sequence into a fixed-dimensional context vector; the other, called the decoder, generates a target sequence, again of variable length, starting from that context vector

  • Neural machine translation aims at building a single neural network that takes as input a source sequence X = (x_1, …, x_{T_x}) and generates a corresponding translation Y = (y_1, …, y_{T_y}), as formalized below
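
Written out, this is the standard left-to-right factorization of the conditional translation probability (a restatement of the bullet above, not taken verbatim from the paper):

```latex
p(Y \mid X) = \prod_{t=1}^{T_y} p\left(y_t \mid y_1, \ldots, y_{t-1}, X\right),
\qquad X = (x_1, \ldots, x_{T_x}), \quad Y = (y_1, \ldots, y_{T_y}).
```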


Summary

Introduction

Neural machine translation: it has been shown that a deep (recurrent) neural network can successfully learn a complex mapping between variable-length input and output sequences on its own. One recurrent neural network, called an encoder, compresses the source sequence into a fixed-dimensional context vector, and the other recurrent neural network, called a decoder, generates a target sequence, again of variable length, starting from that context vector. This basic approach was found to be inefficient when handling long sentences (Cho et al., 2014a), owing to the difficulty of learning a complex mapping between an arbitrarily long sentence and a single fixed-dimensional vector; attention-based models address this by letting the decoder consult per-position encoder annotations rather than one vector. Because the encoder and decoder interact only through such an intermediate representation, it is conceptually possible to build a system that maps a source sentence in any language into a common continuous representation space and decodes that representation into any of the target languages, i.e., a multilingual machine translation system. This possibility is straightforward to realize with basic encoder-decoder networks and has been validated in that setting (Luong et al., 2015a). The experiments show that it is possible to train a single attention-based network to perform multi-way translation.
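
To make the bottleneck concrete, here is a minimal, hypothetical PyTorch sketch of the basic (attention-free) encoder-decoder described above; class names and dimensions are illustrative assumptions, not the paper's configuration. The entire source sentence must pass through the single `context` vector, which is exactly the limitation noted for long sentences.

```python
import torch
import torch.nn as nn

class BasicSeq2Seq(nn.Module):
    """Basic encoder-decoder: the source sentence is squeezed into one
    fixed-dimensional context vector, regardless of its length."""
    def __init__(self, src_vocab, tgt_vocab, emb=256, hid=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hid)
        self.decoder = nn.GRU(emb, hid)
        self.readout = nn.Linear(hid, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode: keep only the final hidden state -- the context vector.
        _, context = self.encoder(self.src_emb(src_ids).unsqueeze(1))
        # Decode: the target is generated conditioned solely on that vector.
        out, _ = self.decoder(self.tgt_emb(tgt_ids).unsqueeze(1), context)
        return self.readout(out.squeeze(1))  # per-step target-vocab logits

# Usage with toy tensors (batch size 1):
model = BasicSeq2Seq(src_vocab=100, tgt_vocab=100)
logits = model(torch.randint(100, (7,)), torch.randint(100, (5,)))
print(logits.shape)  # torch.Size([5, 100])
```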
