Abstract

We propose multi-way, multilingual neural machine translation. The proposed approach enables a single neural translation model to translate between multiple languages, with a number of parameters that grows only linearly with the number of languages. This is made possible by having a single attention mechanism that is shared across all language pairs. We train the proposed multi-way, multilingual model on ten language pairs from WMT'15 simultaneously and observe clear performance improvements over models trained on only one language pair. In particular, we observe that the proposed model significantly improves the translation quality of low-resource language pairs.
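
As a rough illustration of the parameter argument, here is a minimal, hypothetical PyTorch sketch (not the authors' implementation; all class names, dimensions, and the `step` helper are invented for illustration): each language contributes its own encoder, decoder, embedding table, and readout, while a single attention module is shared by every language pair, so adding a language adds only a fixed number of parameters.

```python
import torch
import torch.nn as nn

class SharedAttention(nn.Module):
    """The single attention module shared across all language pairs."""
    def __init__(self, enc_dim, dec_dim, att_dim):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, att_dim)
        self.dec_proj = nn.Linear(dec_dim, att_dim)
        self.score = nn.Linear(att_dim, 1)

    def forward(self, enc_states, dec_state):
        # enc_states: (src_len, enc_dim); dec_state: (dec_dim,)
        scores = self.score(torch.tanh(self.enc_proj(enc_states) + self.dec_proj(dec_state)))
        alpha = torch.softmax(scores.squeeze(-1), dim=0)       # attention weights
        return (alpha.unsqueeze(-1) * enc_states).sum(dim=0)   # context vector

class MultiWayNMT(nn.Module):
    """One encoder/decoder per language; only the attention is shared,
    so the parameter count grows linearly with the number of languages."""
    def __init__(self, langs, vocab_size, emb=256, hid=512, att=256):
        super().__init__()
        self.embs = nn.ModuleDict({l: nn.Embedding(vocab_size, emb) for l in langs})
        self.encoders = nn.ModuleDict({l: nn.GRU(emb, hid) for l in langs})
        self.decoders = nn.ModuleDict({l: nn.GRUCell(emb + hid, hid) for l in langs})
        self.readout = nn.ModuleDict({l: nn.Linear(hid, vocab_size) for l in langs})
        self.attention = SharedAttention(hid, hid, att)  # shared by all pairs

    def step(self, src_lang, tgt_lang, src_ids, prev_tgt_id, dec_state):
        """One decoder step translating src_lang -> tgt_lang (batch size 1).
        Re-encoding the source each step is wasteful; kept for brevity."""
        enc_out, _ = self.encoders[src_lang](self.embs[src_lang](src_ids).unsqueeze(1))
        ctx = self.attention(enc_out.squeeze(1), dec_state)  # same module for every pair
        inp = torch.cat([self.embs[tgt_lang](prev_tgt_id), ctx]).unsqueeze(0)
        dec_state = self.decoders[tgt_lang](inp, dec_state.unsqueeze(0)).squeeze(0)
        return self.readout[tgt_lang](dec_state), dec_state

# Each extra language adds one encoder, decoder, embedding table and readout,
# but no new attention, so total parameters grow linearly in len(langs).
for n in (2, 4, 8):
    model = MultiWayNMT([f"lang{i}" for i in range(n)], vocab_size=10_000)
    print(n, sum(p.numel() for p in model.parameters()))
```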

Highlights

  • Neural machine translation: it has been shown that a deep neural network can successfully learn a complex mapping between variable-length input and output sequences on its own

  • One recurrent neural network, called the encoder, compresses the source sequence into a fixed-dimensional context vector; the other, called the decoder, generates a target sequence, again of variable length, starting from that context vector

  • Neural machine translation aims at building a single neural network that takes as input a source sequence X = (x_1, …, x_{T_x}) and generates a corresponding translation Y = (y_1, …, y_{T_y}), as formalized below
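
Written out, this is the standard left-to-right factorization of the conditional translation probability (a restatement of the bullet above, not taken verbatim from the paper):

```latex
p(Y \mid X) = \prod_{t=1}^{T_y} p\left(y_t \mid y_1, \ldots, y_{t-1}, X\right),
\qquad X = (x_1, \ldots, x_{T_x}), \quad Y = (y_1, \ldots, y_{T_y}).
```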


Summary

Introduction

Neural machine translation: it has been shown that a deep (recurrent) neural network can successfully learn a complex mapping between variable-length input and output sequences on its own. One recurrent neural network, called an encoder, compresses the source sequence into a fixed-dimensional context vector, and the other recurrent neural network, called a decoder, generates a target sequence, again of variable length, starting from that context vector. This basic approach was found to be inefficient when handling long sentences (Cho et al., 2014a), owing to the difficulty of learning a complex mapping between an arbitrarily long sentence and a single fixed-dimensional vector; attention-based models address this by letting the decoder consult per-position encoder annotations rather than one vector. Because the encoder and decoder interact only through such an intermediate representation, it is conceptually possible to build a system that maps a source sentence in any language into a common continuous representation space and decodes that representation into any of the target languages, i.e., a multilingual machine translation system. This possibility is straightforward to realize with basic encoder-decoder networks and has been validated in that setting (Luong et al., 2015a). The experiments show that it is possible to train a single attention-based network to perform multi-way translation.
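
To make the bottleneck concrete, here is a minimal, hypothetical PyTorch sketch of the basic (attention-free) encoder-decoder described above; class names and dimensions are illustrative assumptions, not the paper's configuration. The entire source sentence must pass through the single `context` vector, which is exactly the limitation noted for long sentences.

```python
import torch
import torch.nn as nn

class BasicSeq2Seq(nn.Module):
    """Basic encoder-decoder: the source sentence is squeezed into one
    fixed-dimensional context vector, regardless of its length."""
    def __init__(self, src_vocab, tgt_vocab, emb=256, hid=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hid)
        self.decoder = nn.GRU(emb, hid)
        self.readout = nn.Linear(hid, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode: keep only the final hidden state -- the context vector.
        _, context = self.encoder(self.src_emb(src_ids).unsqueeze(1))
        # Decode: the target is generated conditioned solely on that vector.
        out, _ = self.decoder(self.tgt_emb(tgt_ids).unsqueeze(1), context)
        return self.readout(out.squeeze(1))  # per-step target-vocab logits

# Usage with toy tensors (batch size 1):
model = BasicSeq2Seq(src_vocab=100, tgt_vocab=100)
logits = model(torch.randint(100, (7,)), torch.randint(100, (5,)))
print(logits.shape)  # torch.Size([5, 100])
```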
