Dual Transfer Learning for Neural Machine Translation with Marginal Distribution Regularization

Yijun Wang, Jiang Bian, Yingce Xia, Li Zhao, Tao Qin, Guiquan Liu, Tie-Yan Liu

doi:10.1609/aaai.v32i1.11999

Abstract

Neural machine translation (NMT) relies heavily on parallel bilingual data for training. Since large-scale, high-quality parallel corpora are usually costly to collect, it is appealing to exploit monolingual corpora to improve NMT. Inspired by the law of total probability, which connects the probability of a given target-side monolingual sentence to the conditional probability of translating from a source sentence to the target one, we propose to explicitly exploit this connection to learn from and regularize the training of NMT models using monolingual data. The key technical challenge of this approach is that computing the sum of the conditional probabilities over all possible source sentences is intractable, because there are exponentially many candidate source sentences for a given target monolingual sentence. We address this challenge by leveraging the dual translation model (target-to-source translation) to sample several of the most likely source-side sentences, which avoids enumerating all possible candidate source sentences. That is, we transfer the knowledge contained in the dual model to boost the training of the primal model (source-to-target translation), and we call this approach dual transfer learning. Experimental results on English-French and German-English tasks demonstrate that dual transfer learning achieves significant improvement over several strong baselines and obtains new state-of-the-art results.
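The idea in the abstract can be illustrated with a toy sketch. The law of total probability gives P(y) = sum over x of P(x) P(y|x), and the sum can be approximated by drawing a few source sentences from the dual model P(x|y) and reweighting each one. The abstract only states that the dual model is used for sampling; the importance-weighting scheme, the squared log-gap penalty, the four-sentence vocabulary, and every name and number below are assumptions made for illustration, not the authors' implementation.

```python
import math
import random

# Hypothetical toy distributions over four candidate source sentences for one
# fixed target sentence y. All values are invented for illustration.
PRIOR = {"x1": 0.4, "x2": 0.3, "x3": 0.2, "x4": 0.1}      # P(x), source language model
PRIMAL = {"x1": 0.5, "x2": 0.1, "x3": 0.05, "x4": 0.01}   # P(y | x), primal model
DUAL = {"x1": 0.7, "x2": 0.15, "x3": 0.1, "x4": 0.05}     # P(x | y), dual model (sampler)


def exact_marginal():
    """Law of total probability: P(y) = sum_x P(x) * P(y | x).

    Feasible here only because the toy source space has four elements.
    """
    return sum(PRIOR[x] * PRIMAL[x] for x in PRIOR)


def sampled_marginal(k, rng):
    """Estimate P(y) from k source sentences sampled from the dual model.

    Each sample is weighted by P(x) * P(y | x) / P(x | y), so the average
    is an unbiased estimate of the full sum (assumed weighting scheme).
    """
    sources = list(DUAL)
    weights = [DUAL[x] for x in sources]
    samples = rng.choices(sources, weights=weights, k=k)
    return sum(PRIOR[x] * PRIMAL[x] / DUAL[x] for x in samples) / k


def marginal_gap(log_p_y, estimate):
    """Squared gap between a reference log P(y) and the log of the estimate.

    A penalty of this shape could serve as a regularization term; the exact
    form is an assumption, not taken from the abstract.
    """
    return (log_p_y - math.log(estimate)) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = sampled_marginal(10_000, rng)
    print("exact marginal:  ", exact_marginal())
    print("sampled estimate:", estimate)
    print("penalty:         ", marginal_gap(math.log(exact_marginal()), estimate))
```

In this sketch the sampler concentrates on the sources that contribute most to the sum, which is why a handful of samples from a good dual model can stand in for the full enumeration.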

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 27, 2018
Citations: 61

Similar Papers

Semi-Supervised Neural Machine Translation via Marginal Distribution Estimation
Yijun Wang ... Tie-Yan Liu
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 27
01 Oct 2019

Multilingual Neural Translation
14 Feb 2020

Improving Neural Machine Translation Models with Monolingual Data
Rico Sennrich ... Alexandra Birch
01 Jan 2015

Adapting Attention-Based Neural Network to Low-Resource Mongolian-Chinese Machine Translation
Jing Wu ... Jian Du
01 Jan 2015
