Abstract

This paper proposes a novel character-level neural machine translation (NMT) model that improves translation quality by fusing word-level and character-level attention information. In our work, a bidirectional Gated Recurrent Unit (GRU) network automatically composes word-level information from the input sequence of characters. In contrast to traditional NMT models, our proposed model incorporates two kinds of attention: a character-level attention that attends to the original input characters, and a word-level attention that attends to the automatically composed words. With these two attentions, the model encodes character-level and word-level information simultaneously. We find that the composed word-level information is compatible with and complementary to the original character-level input. Experimental results on Chinese-English translation tasks show that the proposed model offers a gain of up to +1.92 BLEU points over traditional word-based NMT models. Furthermore, its translation performance is comparable to that of recent strong models, including the state of the art.
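The overall idea can be sketched as follows. The snippet below is a minimal illustrative sketch in PyTorch, not the authors' implementation: the `word_boundaries` argument, the mean-pooling used to compose word vectors from character states, the dot-product attention, and the concatenation used to fuse the two contexts are all assumptions made for illustration. The abstract only specifies that a bidirectional GRU composes word-level information from the character sequence and that character-level and word-level attentions are combined.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CharWordEncoder(nn.Module):
    """Encodes a character sequence and derives word-level states from it."""

    def __init__(self, char_vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(char_vocab_size, emb_dim)
        # Bidirectional GRU over characters, as described in the abstract.
        self.char_gru = nn.GRU(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        # A second bidirectional GRU re-encodes the composed word sequence
        # (an assumption about how the word-level states are built).
        self.word_gru = nn.GRU(2 * hidden_dim, hidden_dim, bidirectional=True, batch_first=True)

    def forward(self, char_ids, word_boundaries):
        # char_ids: (batch, n_chars); word_boundaries: list of (start, end) character spans.
        char_states, _ = self.char_gru(self.embed(char_ids))        # (batch, n_chars, 2H)
        # Compose one vector per word by mean-pooling its character states
        # (illustrative choice for the composition step).
        word_inputs = torch.stack(
            [char_states[:, s:e].mean(dim=1) for s, e in word_boundaries], dim=1
        )                                                            # (batch, n_words, 2H)
        word_states, _ = self.word_gru(word_inputs)                  # (batch, n_words, 2H)
        return char_states, word_states


def dual_attention(query, char_states, word_states):
    """Computes character-level and word-level attention contexts and fuses them."""
    # query: (batch, 2H) decoder state; simple dot-product attention for brevity.
    char_scores = torch.bmm(char_states, query.unsqueeze(2)).squeeze(2)  # (batch, n_chars)
    word_scores = torch.bmm(word_states, query.unsqueeze(2)).squeeze(2)  # (batch, n_words)
    char_ctx = torch.bmm(F.softmax(char_scores, dim=1).unsqueeze(1), char_states).squeeze(1)
    word_ctx = torch.bmm(F.softmax(word_scores, dim=1).unsqueeze(1), word_states).squeeze(1)
    # Fuse the two contexts; concatenation is one plausible fusion choice.
    return torch.cat([char_ctx, word_ctx], dim=1)                        # (batch, 4H)


if __name__ == "__main__":
    enc = CharWordEncoder(char_vocab_size=100)
    chars = torch.randint(0, 100, (2, 10))            # batch of 2 sentences, 10 characters each
    boundaries = [(0, 3), (3, 7), (7, 10)]            # hypothetical word spans over the characters
    c_states, w_states = enc(chars, boundaries)
    query = torch.zeros(2, 256)                       # dummy decoder state (2H = 256)
    ctx = dual_attention(query, c_states, w_states)
    print(ctx.shape)                                  # torch.Size([2, 512])
```

In a full model, the fused context would be fed to the decoder at each step, so that the next-word prediction can draw on character-level and composed word-level information at the same time.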
