Abstract

Rare and unknown words are a significant problem in Uyghur-Chinese machine translation, especially for neural machine translation (NMT) models. We propose a novel way to handle them. Building on neural machine translation that uses pointers over the input sequence, our approach consists of a pre-processing step and a post-processing step and can be applied to any NMT model. Pre-processing modifies the Uyghur-Chinese corpus to extend the capability of the pointer network, and post-processing retranslates the raw translation using a phrase-based machine translation model or a wordlist. Experiments show that an NMT model using the proposed approach achieves a higher BLEU score than the phrase-based model on Uyghur-Chinese MT.
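
As a rough illustration of the wordlist variant of the post-processing step, the sketch below assumes that the raw translation may still contain source-language tokens copied over by the pointer mechanism, and that a bilingual wordlist is available as a simple dictionary. The function name `post_process`, the token format, and the toy data are all hypothetical and are not taken from the paper; the phrase-based retranslation variant is not shown.

```python
from typing import Dict, List


def post_process(raw_translation: List[str], wordlist: Dict[str, str]) -> List[str]:
    """Replace copied source-language tokens with their wordlist translations."""
    result = []
    for token in raw_translation:
        # Assumption: a token found in the wordlist is a copied source word.
        # Tokens without an entry are kept unchanged.
        result.append(wordlist.get(token, token))
    return result


# Hypothetical toy data: placeholder tokens, not real Uyghur or Chinese words.
toy_wordlist = {"uy_rare_1": "zh_rare_1"}
raw = ["zh_a", "uy_rare_1", "zh_b", "uy_rare_2"]
print(post_process(raw, toy_wordlist))
```

In this sketch, a copied token with no wordlist entry (here `uy_rare_2`) simply passes through, which is the case where a fallback such as a phrase-based model could be used instead.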
