Improving Performance of NMT Using Semantic Concept of WordNet Synset

Fangxu Liu, Gouyi Miao, Yufeng Chen, Yujie Zhang, Jinan Xu

doi:10.1007/978-981-13-3083-4_4

Abstract

Neural machine translation (NMT) has made promising progress in recent years. However, to reduce computational complexity, NMT typically limits its vocabulary to a fixed, manageable size, which leads to the rare-word and out-of-vocabulary (OOV) problems. In this paper, we show that semantic concept information about words can help NMT learn better semantic word representations and improve translation accuracy. The key idea is to use the external semantic knowledge base WordNet to replace rare words and OOVs with the semantic concepts of their WordNet synsets. More specifically, we propose two semantic similarity models to obtain the concepts most similar to rare words and OOVs. Experimental results on four translation tasks (English-to-German, German-to-English, English-to-Chinese, and Chinese-to-English) show that our method outperforms the baseline RNNSearch by 2.38–2.88 BLEU points. Furthermore, a hybrid method that combines BPE with our proposed method gains a further 0.39–0.97 BLEU points over BPE alone. The experiments and analysis presented in this study also demonstrate that the proposed method can significantly improve the translation quality of OOVs in NMT.
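The replacement idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy synset table stands in for WordNet, the function and variable names are hypothetical, and the selection rule (pick the most frequent in-vocabulary candidate) is a simple placeholder, not either of the two semantic similarity models proposed in the paper.

```python
# Hypothetical toy stand-in for WordNet: each rare word maps to candidate
# concept words taken from its synsets. A real system would query WordNet.
TOY_SYNSETS = {
    "poodle": ["dog", "canine"],
    "sedan": ["car", "automobile"],
}


def replace_rare_words(tokens, vocab_freq, synsets, unk="<unk>"):
    """Replace tokens outside the vocabulary with an in-vocabulary concept.

    tokens     -- list of word strings (one sentence)
    vocab_freq -- dict mapping each in-vocabulary word to its corpus frequency
    synsets    -- dict mapping a word to its candidate concept words
    unk        -- fallback symbol when no in-vocabulary concept exists
    """
    output = []
    for token in tokens:
        if token in vocab_freq:
            # Frequent words are kept unchanged.
            output.append(token)
            continue
        # Keep only candidate concepts the NMT vocabulary can represent.
        candidates = [c for c in synsets.get(token, []) if c in vocab_freq]
        if candidates:
            # Assumption: frequency is used as a placeholder score here;
            # the paper selects concepts with semantic similarity models.
            output.append(max(candidates, key=lambda c: vocab_freq[c]))
        else:
            output.append(unk)
    return output


if __name__ == "__main__":
    vocab_freq = {"the": 3, "dog": 2, "runs": 2, "car": 1}
    print(replace_rare_words(["the", "poodle", "runs"], vocab_freq, TOY_SYNSETS))
```

In this sketch a word with no in-vocabulary concept still falls back to the unknown symbol, so the replacement step only narrows the OOV problem rather than eliminating it.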

Full Text