Combining SMT and NMT Back-Translated Data for Efficient NMT

Alberto Poncelas (School of Computing, DCU, ADAPT Centre, Ireland), Gideon Maillette de Buy Wenniger, Maja Popović, Andy Way, Dimitar Shterionov

doi:10.26615/978-954-452-056-4_107

Open Access

Publication Date: Oct 22, 2019
Citations: 12
License type: CC BY-NC-SA

Affiliation: Dublin City University

Abstract

Neural Machine Translation (NMT) models achieve their best performance when large sets of parallel data are used for training. Consequently, techniques for augmenting the training set have recently become popular. One of these methods is back-translation (Sennrich et al., 2016), which consists of generating synthetic sentences by translating a set of monolingual, target-language sentences using a Machine Translation (MT) model. Generally, NMT models are used for back-translation. In this work, we analyze the performance of models when the training data is extended with synthetic data generated by different MT approaches. In particular, we investigate back-translated data generated not only by NMT but also by Statistical Machine Translation (SMT) models, and by combinations of both. The results reveal that the models achieve the best performance when the training set is augmented with back-translated data created by merging different MT approaches.
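The augmentation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `back_translate`, `build_training_set`, `toy_smt` and `toy_nmt` are hypothetical, the two toy translators merely tag their input instead of translating it, and "merging" is assumed here to mean simple concatenation of the synthetic sets, which is only one possible reading.

```python
from typing import Callable, List, Tuple

# Hypothetical type: any target-to-source MT system (SMT or NMT) wrapped as a function.
Translator = Callable[[str], str]
SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def back_translate(target_sentences: List[str], translate: Translator) -> List[SentencePair]:
    """Pair each authentic target sentence with a synthetic source sentence."""
    return [(translate(sentence), sentence) for sentence in target_sentences]


def build_training_set(
    authentic: List[SentencePair],
    monolingual_target: List[str],
    translators: List[Translator],
) -> List[SentencePair]:
    """Extend the authentic parallel data with back-translated pairs.

    Assumption: the synthetic sets from the different systems are simply
    concatenated; other merging strategies are possible.
    """
    augmented = list(authentic)
    for translate in translators:
        augmented.extend(back_translate(monolingual_target, translate))
    return augmented


# Toy stand-ins for real target-to-source systems (they only tag the input).
def toy_smt(sentence: str) -> str:
    return "<smt> " + sentence


def toy_nmt(sentence: str) -> str:
    return "<nmt> " + sentence


authentic_pairs = [("ein Haus", "a house")]
monolingual = ["a cat", "a dog"]
training_set = build_training_set(authentic_pairs, monolingual, [toy_smt, toy_nmt])
```

In this sketch the target side of every synthetic pair is authentic text, and only the source side is machine-generated. Passing a single translator reproduces the usual NMT-only or SMT-only setting, while passing both gives the combined setting.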

Full Text