Abstract

Over the last few years two promising research directions in low-resource neural machine translation (NMT) have emerged. The first focuses on utilizing high-resource languages to improve the quality of low-resource languages via multilingual NMT. The second direction employs monolingual data with self-supervision to pre-train translation models, followed by fine-tuning on small amounts of supervised data. In this work, we join these two lines of research and demonstrate the efficacy of monolingual data with self-supervision in multilingual NMT. We offer three major results: (i) Using monolingual data significantly boosts the translation quality of low-resource languages in multilingual models. (ii) Self-supervision improves zero-shot translation quality in multilingual models. (iii) Leveraging monolingual data with self-supervision provides a viable path towards adding new languages to multilingual models, getting up to 33 BLEU on ro-en translation without any parallel data or back-translation.
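The self-supervision referred to here is the MASS objective (masked sequence-to-sequence pre-training), which the paper adapts to the multilingual setting (see "Adapting MASS for multilingual models" in the outline below). As a rough, hypothetical illustration of the idea, the Python sketch below builds a single MASS-style training example from a monolingual sentence; the function name mass_mask, the masking ratio, and the example sentence are illustrative assumptions, not the authors' implementation.

    import random

    MASK = "[MASK]"

    def mass_mask(tokens, mask_ratio=0.5, seed=None):
        """Build one MASS-style (masked sequence-to-sequence) example.

        A contiguous span covering roughly `mask_ratio` of the sentence is
        replaced by [MASK] tokens on the encoder side, and the decoder is
        trained to reconstruct exactly that span. Simplified sketch, not the
        paper's exact masking scheme.
        """
        rng = random.Random(seed)
        span_len = max(1, round(len(tokens) * mask_ratio))
        start = rng.randint(0, len(tokens) - span_len)
        encoder_input = tokens[:start] + [MASK] * span_len + tokens[start + span_len:]
        decoder_target = tokens[start:start + span_len]
        return encoder_input, decoder_target

    # Monolingual Romanian sentence; in the multilingual setting a target-language
    # token would also be prepended to the input (Johnson et al., 2017).
    sentence = ["aceasta", "este", "o", "propoziție", "simplă"]
    enc_in, dec_out = mass_mask(sentence, seed=1)
    print(enc_in)   # encoder input with a contiguous masked span, e.g. [..., '[MASK]', '[MASK]', ...]
    print(dec_out)  # the masked span the decoder must reconstruct

Examples of this form, drawn from monolingual corpora, can be mixed into the same training batches as parallel sentence pairs, which is how monolingual data enters the multilingual model here.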

Highlights

  • Recent work has demonstrated the efficacy of multilingual neural machine translation in improving the translation quality of low-resource languages (Firat et al., 2016; Aharoni et al., 2019) as well as zero-shot translation (Ha et al., 2016; Johnson et al., 2017; Arivazhagan et al., 2019b).

  • The most interesting aspect of this work is that we introduce a path towards effectively adding new, unseen languages to a multilingual neural machine translation (NMT) model, showing strong translation quality on several language pairs by leveraging only monolingual data with self-supervised learning, without the need for any parallel data for the new languages.

  • Low-Resource Translation: From Figure 2, we observe that our supervised multilingual NMT model significantly improves translation quality for most low- and medium-resource languages compared with the bilingual baselines.


Summary

Introduction

Recent work has demonstrated the efficacy of multilingual neural machine translation (multilingual NMT) in improving the translation quality of low-resource languages (Firat et al., 2016; Aharoni et al., 2019) as well as zero-shot translation (Ha et al., 2016; Johnson et al., 2017; Arivazhagan et al., 2019b). The success of multilingual NMT on low-resource languages relies heavily on transfer learning from high-resource languages for which copious amounts of parallel data are accessible. Compared with multilingual models trained without any monolingual data, our approach shows consistent improvements in the translation quality of all languages, with gains of more than 10 BLEU points on certain low-resource languages. The most interesting aspect of this work is that we introduce a path towards effectively adding new, unseen languages to a multilingual NMT model, showing strong translation quality on several language pairs by leveraging only monolingual data with self-supervised learning, without the need for any parallel data for the new languages.

Experimental Setup
Adapting MASS for multilingual models
Datasets
Data Sampling
Architecture and Optimization
Using Monolingual Data for Multilingual NMT
Adding New Languages to Multilingual NMT
Related Work
Conclusion and Future Directions
Findings
A Appendices