Abstract
Machine translation based on neural networks has been shown to produce superior results compared with other approaches. Building an efficient neural machine translation (NMT) system requires a large and accurate bilingual corpus for training, together with continual improvement of the methods and techniques used in the translation system. Despite these advantages, one challenge for current NMT systems is the processing of long sentences [1]. In this paper, we propose a method for extracting bilingual phrases to build a phrase-aligned bilingual corpus, and we implement a long-sentence preprocessing technique for use in the NMT model. Experiments in which an NMT system is trained to translate Vietnamese into English using the proposed technique show an improvement in BLEU scores.