Abstract

Large-scale training datasets lie at the core of the recent success of neural machine translation (NMT) models. However, the complex patterns and potential noise in large-scale data make training NMT models difficult. In this work, we explore identifying the inactive training examples that contribute little to model performance, and show that the existence of inactive examples depends on the data distribution. We further introduce data rejuvenation to improve the training of NMT models on large-scale datasets by exploiting inactive examples. The proposed framework consists of three phases. First, we train an identification model on the original training data and use it to distinguish inactive examples from active examples by their sentence-level output probabilities. Then, we train a rejuvenation model on the active examples, which is used to re-label the inactive examples with forward-translation. Finally, the rejuvenated examples and the active examples are combined to train the final NMT model. Experimental results on the WMT14 English-German and English-French datasets show that the proposed data rejuvenation consistently and significantly improves performance for several strong NMT models. Extensive analyses reveal that our approach stabilizes and accelerates the training process of NMT models, resulting in final models with better generalization capability.
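The three-phase framework described above lends itself to a short sketch. The Python code below is a minimal illustration of the pipeline, not the authors' implementation: the callables train_nmt, sentence_log_prob, and translate are hypothetical stand-ins for whatever NMT toolkit is used, and the 10% inactive threshold is an assumption for illustration rather than a value fixed by the abstract.

    from typing import Callable, List, Tuple

    Example = Tuple[str, str]  # (source sentence, target sentence)

    def data_rejuvenation(
        train_data: List[Example],
        train_nmt: Callable[[List[Example]], object],
        sentence_log_prob: Callable[[object, str, str], float],
        translate: Callable[[object, str], str],
        inactive_ratio: float = 0.10,  # assumed cut-off for illustration
    ) -> object:
        # Phase 1: train the identification model on the original data and score
        # every example by its sentence-level output probability.
        identification_model = train_nmt(train_data)
        scored = [
            (sentence_log_prob(identification_model, src, tgt), src, tgt)
            for src, tgt in train_data
        ]
        scored.sort(key=lambda x: x[0])  # lowest probability = most inactive

        n_inactive = int(len(scored) * inactive_ratio)
        inactive = [(src, tgt) for _, src, tgt in scored[:n_inactive]]
        active = [(src, tgt) for _, src, tgt in scored[n_inactive:]]

        # Phase 2: train the rejuvenation model on the active examples and
        # re-label the inactive examples by forward-translating their sources.
        rejuvenation_model = train_nmt(active)
        rejuvenated = [(src, translate(rejuvenation_model, src)) for src, _ in inactive]

        # Phase 3: train the final NMT model on active + rejuvenated examples.
        return train_nmt(active + rejuvenated)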

Highlights

  • Neural machine translation (NMT) is a data-hungry approach, which requires a large amount of data to train a well-performing NMT model (Koehn and Knowles, 2017)

  • We explore an interesting alternative: reactivating the inactive examples in the training data for NMT models

  • We observe a high overlap ratio of the most inactive and active examples across random seeds, model capacities, and model architectures (§4.2). These results provide empirical support for our hypothesis that inactive examples exist in large-scale datasets, are invariant to the specific NMT model, and depend on the data distribution itself

Introduction

Neural machine translation (NMT) is a data-hungry approach, which requires a large amount of data to train a well-performing NMT model (Koehn and Knowles, 2017). We explore an interesting alternative: reactivating the inactive examples in the training data for NMT models. Inactive examples are training examples that contribute only marginally to, or even harm, the performance of NMT models. Experimental results show that removing the 10% most inactive examples can marginally improve translation performance. We observe a high overlap ratio (e.g., around 80%) of the most inactive and active examples across random seeds, model capacities, and model architectures (§4.2). These results provide empirical support for our hypothesis that inactive examples exist in large-scale datasets, are invariant to the specific NMT model, and depend on the data distribution itself.
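The overlap claim above can be made concrete with a small sketch. The code below shows one illustrative way to measure the overlap between the sets of most inactive examples identified by two independently trained models (e.g., different random seeds, capacities, or architectures); the function names and the 10% ratio are assumptions, not taken from the paper.

    from typing import Dict

    def most_inactive(scores: Dict[int, float], ratio: float = 0.10) -> set:
        """Return the ids of the lowest-scoring (most inactive) examples."""
        k = int(len(scores) * ratio)
        ranked = sorted(scores, key=scores.get)  # ascending sentence-level probability
        return set(ranked[:k])

    def overlap_ratio(scores_a: Dict[int, float], scores_b: Dict[int, float],
                      ratio: float = 0.10) -> float:
        """Fraction of examples flagged as most inactive by both models."""
        set_a = most_inactive(scores_a, ratio)
        set_b = most_inactive(scores_b, ratio)
        return len(set_a & set_b) / max(len(set_a), 1)

An overlap ratio around 0.8 under this measure, as reported above, would indicate that inactiveness is largely a property of the data distribution rather than of any particular model.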
