Abstract

Machine translation has become an indispensable application on mobile phones. However, current mainstream neural machine translation models rely on ever-increasing parameter counts to achieve better performance, which is impractical for mobile devices. In this paper, we improve neural machine translation (NMT) with shallow syntax (e.g., POS tags) of the target language, which offers better accuracy and lower latency than deep syntax such as dependency parsing. In particular, our models require fewer parameters and less runtime than more complex machine translation models, making mobile applications feasible. Specifically, we present three RNN-based NMT decoding models (an independent decoder, a gate-shared decoder, and a fully shared decoder) that jointly predict the target word and POS tag sequences. Experiments on Chinese-English and German-English translation tasks show that the fully shared decoder achieves the best performance, improving the BLEU score by 1.4 and 2.25 points, respectively, over the attention-based NMT baseline. In addition, we extend the idea to Transformer-based models, and the experimental results show that the BLEU score improves further.
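The abstract does not give implementation details, but the fully shared decoder can be pictured as a single recurrent state that feeds two softmax heads, one for words and one for POS tags. The following is a minimal PyTorch sketch under that assumption; the class name FullySharedDecoder, the choice of a GRU cell, and all dimensions are illustrative rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class FullySharedDecoder(nn.Module):
    """Sketch of a decoder whose recurrent state is fully shared between
    word and POS-tag prediction; only the output projections differ.
    Hyperparameters and names are hypothetical."""

    def __init__(self, word_vocab, pos_vocab, emb_dim=512, hid_dim=512):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, emb_dim)
        self.pos_emb = nn.Embedding(pos_vocab, emb_dim)
        # One GRU cell serves both tasks: the "fully shared" variant.
        # Input = previous word emb + previous POS emb + attention context.
        self.cell = nn.GRUCell(2 * emb_dim + hid_dim, hid_dim)
        self.word_out = nn.Linear(hid_dim, word_vocab)  # word head
        self.pos_out = nn.Linear(hid_dim, pos_vocab)    # POS head

    def forward(self, prev_word, prev_pos, context, state):
        # prev_word, prev_pos: (batch,) token ids of the previous step;
        # context: (batch, hid_dim) attention context from the encoder;
        # state: (batch, hid_dim) shared recurrent state.
        x = torch.cat([self.word_emb(prev_word),
                       self.pos_emb(prev_pos),
                       context], dim=-1)
        state = self.cell(x, state)
        # Two predictions from the same state at every decoding step.
        return self.word_out(state), self.pos_out(state), state
```

In this sketch, training would minimize the sum of the word and POS cross-entropy losses, so the syntactic supervision shapes the shared state while adding only a small POS projection on top of a standard decoder.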
