Abstract

Multilingual neural machine translation models typically handle one source language at a time. However, prior work has shown that translating from multiple source languages improves translation quality. Unlike existing approaches to multi-source translation, which are limited to the test scenario where parallel source sentences from multiple languages are available at inference time, we propose to improve multilingual translation in the more common scenario by exploiting synthetic source sentences from auxiliary languages. We train our model on synthetic multi-source corpora and apply random masking to enable flexible inference with single-source or bi-source inputs. Extensive experiments on Chinese/English-Japanese and a large-scale multilingual translation benchmark show that our model significantly outperforms the multilingual baseline, by up to +4.0 BLEU, with the largest improvements on low-resource or distant language pairs.
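
To make the masking scheme concrete, below is a minimal sketch of how a bi-source training input could be assembled. The tag format (<2tgt>, <lang>), the make_multi_source_input helper, and the 0.5 masking rate are illustrative assumptions, not the paper's exact configuration.

    import random

    def make_multi_source_input(src: str, src_lang: str,
                                aux: str, aux_lang: str,
                                tgt_lang: str,
                                mask_prob: float = 0.5) -> str:
        """Concatenate the source with a synthetic auxiliary sentence.

        With probability mask_prob the auxiliary sentence is masked out,
        so the model sees both bi-source and single-source inputs during
        training and accepts either form at inference time.
        """
        pieces = [f"<2{tgt_lang}>", f"<{src_lang}>", src]
        if random.random() >= mask_prob:  # keep the auxiliary sentence
            pieces += [f"<{aux_lang}>", aux]
        return " ".join(pieces)

    # Example: a Chinese source sentence paired with a synthetic English
    # auxiliary sentence, translating into Japanese.
    print(make_multi_source_input("我喜欢读书", "zh",
                                  "I like reading books", "en", "ja"))

Randomly masking the auxiliary sentence during training is what allows a single model to handle both input forms: when the auxiliary sentence is dropped, the example reduces to ordinary single-source translation.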

Highlights

  • Neural machine translation (NMT) has achieved state-of-the-art performance across domains and language pairs (Wu et al., 2016; Bojar et al., 2018; Hassan et al., 2018; Barrault et al., 2019)

  • One of the advantages of NMT over statistical machine translation is that training a single multilingual model on parallel data from multiple language pairs enables information sharing between high-resource and low-resource languages, which has been shown to improve translation quality, especially on low-resource language pairs

  • In the more common scenario where only one source sentence is provided, we could improve the translation quality of multilingual NMT models by augmenting the source input with a synthetic sentence in another language, generated by a translation model

Introduction

Neural machine translation (NMT) has achieved state-of-the-art performance across domains and language pairs (Wu et al., 2016; Bojar et al., 2018; Hassan et al., 2018; Barrault et al., 2019). One of the advantages of NMT over statistical machine translation is that training a single multilingual model on parallel data from multiple language pairs enables information sharing between high-resource and low-resource languages, which has been shown to improve translation quality, especially on low-resource language pairs. While multilingual NMT models typically handle one language pair at a time during both training and inference (Ha et al., 2016; Johnson et al., 2017), prior work has shown that translating from multiple source languages improves translation quality; at inference time, however, such multi-source models are limited to the application scenario where parallel source sentences from multiple languages are available. In the more common scenario where only one source sentence is provided, we could improve the translation quality of multilingual NMT models by augmenting the source input with a synthetic sentence in another language, generated by a translation model. We therefore propose a multilingual NMT model that leverages a synthetic source sentence from an auxiliary language to better translate a source sentence into the target language.
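
As a rough illustration of this inference-time procedure, the sketch below first generates the synthetic auxiliary sentence with an ordinary single-source model and then feeds the concatenated bi-source input to the multi-source model. baseline_translate and multi_source_translate are hypothetical stand-ins for trained NMT decoding functions, and the tag format follows the training sketch after the abstract.

    # Minimal inference-time sketch; no masking is applied here, since the
    # synthetic auxiliary sentence is always available at this point.
    def translate_with_auxiliary(src: str, src_lang: str,
                                 aux_lang: str, tgt_lang: str,
                                 baseline_translate, multi_source_translate) -> str:
        # Step 1: produce a synthetic auxiliary-language version of the
        # source with an ordinary single-source model.
        synthetic_aux = baseline_translate(src, src_lang, aux_lang)
        # Step 2: build the bi-source input and translate it with the
        # multi-source model.
        bi_source = f"<2{tgt_lang}> <{src_lang}> {src} <{aux_lang}> {synthetic_aux}"
        return multi_source_translate(bi_source)

Because training randomly masked the auxiliary sentence, the same multi-source model can also be called with the single-source input "<2tgt> <src_lang> src" when no auxiliary translation is generated.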
