Abstract

Neural Machine Translation (NMT) has recently achieved state-of-the-art results on many machine translation tasks, but one of the challenges NMT faces is the scarcity of parallel corpora, especially for low-resource language pairs. As a result, NMT performs much less effectively on low-resource languages. To address this problem, in this paper we describe a novel NMT model that is based on the encoder-decoder architecture and relies on character-level inputs. Our proposed model employs Convolutional Neural Networks (CNN) and highway networks over character inputs, whose outputs are fed to an encoder-decoder neural machine translation network. In addition, we present two further approaches to improve the performance of the low-resource NMT system. First, we use language modeling, implemented as denoising autoencoding, to pre-train and initialize the full model. Second, we share the weights of the first few layers of the two encoders between the two languages to strengthen the encoding ability of the model. We evaluate our model on two low-resource language pairs. On the IWSLT2015 English-Vietnamese translation task, our proposed model obtains improvements of up to 2.5 BLEU points over the baseline. We also outperform the baseline by more than 3 BLEU points on the CWMT2018 Chinese-Mongolian translation task.
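The character-level front end described above (a CNN over character embeddings followed by a highway network, producing one vector per word for the encoder-decoder) can be sketched roughly as follows. This is a minimal illustrative sketch in NumPy, not the paper's implementation; all dimensions, the single convolution width, the ReLU/tanh choices, and the gate bias are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (not taken from the paper).
vocab_size, char_emb, seq_len = 50, 16, 12   # chars per word = seq_len
n_filters, kernel = 32, 3

def conv1d_max(x, W):
    """1-D convolution over characters with max-over-time pooling.
    x: (seq_len, char_emb); W: (n_filters, kernel, char_emb)."""
    windows = np.stack([x[i:i + kernel] for i in range(len(x) - kernel + 1)])
    # feature map: (positions, n_filters)
    fmap = np.einsum('pkc,fkc->pf', windows, W)
    return np.tanh(fmap).max(axis=0)          # (n_filters,)

def highway(y, Wh, Wt, bt):
    """Highway layer: gate t mixes the transform H(y) with the carry y."""
    t = 1.0 / (1.0 + np.exp(-(y @ Wt + bt)))  # sigmoid transform gate
    h = np.maximum(0.0, y @ Wh)               # ReLU transform H(y)
    return t * h + (1.0 - t) * y

emb = rng.normal(size=(vocab_size, char_emb))
W  = rng.normal(size=(n_filters, kernel, char_emb)) * 0.1
Wh = rng.normal(size=(n_filters, n_filters)) * 0.1
Wt = rng.normal(size=(n_filters, n_filters)) * 0.1
bt = -2.0 * np.ones(n_filters)  # negative bias biases the gate toward carrying y

chars = rng.integers(0, vocab_size, size=seq_len)   # one word as character ids
word_vec = highway(conv1d_max(emb[chars], W), Wh, Wt, bt)
print(word_vec.shape)  # one fixed-size vector per word, fed to the NMT encoder
```

Because the word representation is built from characters, the model has no closed word vocabulary, which is what makes this front end attractive in low-resource settings.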
