Abstract

This paper proposes a novel character-level neural machine translation (NMT) model that improves translation quality by fusing word-level and character-level attention information. In our work, a bidirectional Gated Recurrent Unit (GRU) network automatically composes word-level information from the input sequence of characters. In contrast to traditional NMT models, our proposed model incorporates two kinds of attention: a character-level attention that attends to the original input characters, and a word-level attention that attends to the automatically composed words. With these two attentions, the model encodes character-level and word-level information simultaneously. We find that the composed word-level information is compatible with and complementary to the original character-level input. Experimental results on Chinese-English translation tasks show that the proposed model offers a gain of up to +1.92 BLEU points over traditional word-based NMT models. Furthermore, its translation performance is comparable to that of recent strong models, including the state of the art.
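The overall idea can be sketched as follows. The snippet below is a minimal illustrative sketch in PyTorch, not the authors' implementation: the `word_boundaries` argument, the mean-pooling used to compose word vectors from character states, the dot-product attention, and the concatenation used to fuse the two contexts are all assumptions made for illustration. The abstract only specifies that a bidirectional GRU composes word-level information from the character sequence and that character-level and word-level attentions are combined.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CharWordEncoder(nn.Module):
    """Encodes a character sequence and derives word-level states from it."""

    def __init__(self, char_vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(char_vocab_size, emb_dim)
        # Bidirectional GRU over characters, as described in the abstract.
        self.char_gru = nn.GRU(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        # A second bidirectional GRU re-encodes the composed word sequence
        # (an assumption about how the word-level states are built).
        self.word_gru = nn.GRU(2 * hidden_dim, hidden_dim, bidirectional=True, batch_first=True)

    def forward(self, char_ids, word_boundaries):
        # char_ids: (batch, n_chars); word_boundaries: list of (start, end) character spans.
        char_states, _ = self.char_gru(self.embed(char_ids))        # (batch, n_chars, 2H)
        # Compose one vector per word by mean-pooling its character states
        # (illustrative choice for the composition step).
        word_inputs = torch.stack(
            [char_states[:, s:e].mean(dim=1) for s, e in word_boundaries], dim=1
        )                                                            # (batch, n_words, 2H)
        word_states, _ = self.word_gru(word_inputs)                  # (batch, n_words, 2H)
        return char_states, word_states


def dual_attention(query, char_states, word_states):
    """Computes character-level and word-level attention contexts and fuses them."""
    # query: (batch, 2H) decoder state; simple dot-product attention for brevity.
    char_scores = torch.bmm(char_states, query.unsqueeze(2)).squeeze(2)  # (batch, n_chars)
    word_scores = torch.bmm(word_states, query.unsqueeze(2)).squeeze(2)  # (batch, n_words)
    char_ctx = torch.bmm(F.softmax(char_scores, dim=1).unsqueeze(1), char_states).squeeze(1)
    word_ctx = torch.bmm(F.softmax(word_scores, dim=1).unsqueeze(1), word_states).squeeze(1)
    # Fuse the two contexts; concatenation is one plausible fusion choice.
    return torch.cat([char_ctx, word_ctx], dim=1)                        # (batch, 4H)


if __name__ == "__main__":
    enc = CharWordEncoder(char_vocab_size=100)
    chars = torch.randint(0, 100, (2, 10))            # batch of 2 sentences, 10 characters each
    boundaries = [(0, 3), (3, 7), (7, 10)]            # hypothetical word spans over the characters
    c_states, w_states = enc(chars, boundaries)
    query = torch.zeros(2, 256)                       # dummy decoder state (2H = 256)
    ctx = dual_attention(query, c_states, w_states)
    print(ctx.shape)                                  # torch.Size([2, 512])
```

In a full model, the fused context would be fed to the decoder at each step, so that the next-word prediction can draw on character-level and composed word-level information at the same time.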
