Unsupervised Extraction of Partial Translations for Neural Machine Translation

Benjamin Marie, Atsushi Fujita

doi:10.18653/v1/n19-1384

Abstract

In neural machine translation (NMT), monolingual data are usually exploited through back-translation: sentences in the target language are translated into the source language to synthesize new parallel data. While this method provides more training data to better model the target language, on the source side it only exploits translations that the NMT system can already generate using a model trained on existing parallel data. In this work, we assume that new translation knowledge can be extracted from monolingual data without relying at all on existing parallel data. We propose a new algorithm for extracting from monolingual data what we call partial translations: pairs of source and target sentences that contain sequences of tokens that are translations of each other. Our algorithm is fully unsupervised and takes only source and target monolingual data as input. Our empirical evaluation shows that partial translations can be used in combination with back-translation to further improve NMT models. Furthermore, while partial translations are particularly useful for low-resource language pairs, they can also be successfully exploited in resource-rich scenarios to improve translation quality.
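
As an illustration of the back-translation step described in the abstract, the following is a minimal sketch and not the authors' implementation. The function `back_translate`, its parameter `translate_tgt_to_src`, and the toy word-by-word translator are all hypothetical names introduced here; in practice the translator would be an NMT model trained on existing parallel data. The sketch does not cover the extraction of partial translations, whose algorithm the abstract does not detail.

```python
from typing import Callable, List, Tuple


def back_translate(
    target_sentences: List[str],
    translate_tgt_to_src: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each genuine target sentence with a synthetic source sentence.

    `translate_tgt_to_src` is an assumed stand-in for a target-to-source
    translation model; any callable mapping a string to a string works.
    """
    pairs = []
    for tgt in target_sentences:
        src = translate_tgt_to_src(tgt)  # synthetic source side
        if src.strip():  # skip empty translations
            pairs.append((src, tgt))
    return pairs


# Toy word-by-word translator (hypothetical), used only to make the sketch runnable.
toy_lexicon = {"le": "the", "chat": "cat", "noir": "black"}


def toy_translate(sentence: str) -> str:
    return " ".join(toy_lexicon.get(tok, tok) for tok in sentence.split())


synthetic = back_translate(["le chat noir"], toy_translate)
print(synthetic)
```

The synthetic pairs keep the human-written sentence on the target side, which is why back-translation mainly helps model the target language, as the abstract notes.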
