Abstract

Traditional active learning (AL) methods for machine translation (MT) rely on heuristics. However, these heuristics are limited when the characteristics of the MT problem change due to, e.g., the language pair or the amount of the initial bitext. In this paper, we present a framework to learn sentence selection strategies for neural MT. We train the AL query strategy on a high-resource language pair using AL simulations, and then transfer it to the low-resource language pair of interest. The learned query strategy capitalizes on the shared characteristics between the language pairs to make effective use of the AL budget. Our experiments on three language pairs confirm that our method is more effective than strong heuristic-based methods in various conditions, including cold-start and warm-start as well as small and extremely small data conditions.

Highlights

  • Parallel training bitext plays a key role in the quality of neural machine translation (NMT)

  • For each of these data conditions, we experiment with both cold-start and warm-start settings, using the pre-trained multilingual word embeddings from Ammar et al. (2016) or those we have trained with the available bitext plus additional monotext

  • The goal of this paper is to provide an approach to learn an active learning strategy for NMT based on a Hierarchical Markov Decision Process (HMDP) formulation of pool-based AL (Bachman et al., 2017; Liu et al., 2018)


Summary

Introduction

Parallel training bitext plays a key role in the quality of neural machine translation (NMT). Learning high-quality NMT models in bilingually low-resource scenarios is one of the key challenges, as NMT's quality degrades severely in such settings (Koehn and Knowles, 2017). The importance of learning NMT models in scarce parallel bitext scenarios has gained attention. Unsupervised approaches try to learn NMT models without the need for parallel bitext (Artetxe et al., 2017; Lample et al., 2017a). Dual learning/back-translation tries to start off from a small amount of bilingual text and leverage monolingual text in the source and target languages (Sennrich et al., 2015a; He et al., 2016). Zero/few-shot approaches attempt to transfer NMT models learned in rich bilingual settings to low-resource settings (Johnson et al., 2016; Gu et al., 2018).
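The pool-based AL setting the paper builds on can be sketched as a simple query loop: repeatedly score the unlabeled pool, send the highest-scoring sentence for translation, and retrain. The sketch below is illustrative only; `score_fn` and `train_fn` are hypothetical placeholders standing in for the paper's learned query strategy and NMT training, not its actual implementation.

```python
def pool_based_al(pool, budget, score_fn, train_fn):
    """Minimal pool-based active learning loop (illustrative sketch).

    pool:     list of unlabeled source sentences
    budget:   number of sentences we can afford to have translated
    score_fn: (model, sentence) -> informativeness score (placeholder
              for a heuristic or a learned query strategy)
    train_fn: labeled_data -> model (placeholder for NMT training)
    """
    labeled = []
    model = None  # cold-start: no model before the first query
    for _ in range(budget):
        # Pick the most informative sentence under the current model.
        best = max(pool, key=lambda s: score_fn(model, s))
        pool.remove(best)
        labeled.append(best)  # in practice: obtain its translation
        model = train_fn(labeled)  # retrain on the grown bitext
    return labeled, model
```

A heuristic baseline would plug a fixed rule (e.g., sentence length or model uncertainty) into `score_fn`; the paper's contribution is to learn this scoring policy on a high-resource pair and transfer it.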


