Abstract

The difference in word order between source and target languages is a serious hurdle for machine translation. Preordering methods, which reorder the words in a source sentence before translation to obtain a word ordering similar to that of the target language, significantly improve translation quality in statistical machine translation. While preordering position information improved translation quality in recurrent neural network-based models, questions such as how to use preordering information and whether it is helpful for the Transformer model remain unaddressed. In this article, we successfully employed preordering techniques in Transformer-based neural machine translation. Specifically, we proposed a novel preordering encoding that exploits the reordering information of the source and target sentences as positional encoding in the Transformer model. Experimental results on the ASPEC Japanese–English and WMT 2015 English–German, English–Czech, and English–Russian translation tasks confirmed that the proposed method significantly improved the translation quality of the Transformer model, as measured by BLEU, by 1.34 points in the Japanese-to-English task, 2.19 points in the English-to-German task, 0.15 points in the Czech-to-English task, and 1.48 points in the English-to-Russian task.
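As a rough illustration of the preordering idea described in the abstract (a minimal sketch, not the authors' implementation; the example sentence and the permutation are hypothetical), the snippet below reorders a Japanese source sentence toward English-like word order and records each token's position after preordering, which is the kind of reordering information the proposed encoding feeds into the Transformer's positional encoding.

```python
# Minimal sketch of preordering (hypothetical example, not the paper's code).
# A preordering model outputs a permutation that rearranges the source
# tokens toward the target-language word order.

source = ["kare", "wa", "hon", "o", "yonda"]   # Japanese (SOV): "he read a book"

# Hypothetical permutation from a preordering model: the verb is moved
# forward to approximate English (SVO) order.
permutation = [0, 1, 4, 2, 3]                  # -> kare wa yonda hon o

preordered = [source[i] for i in permutation]
print(preordered)                              # ['kare', 'wa', 'yonda', 'hon', 'o']

# Position of each original token after preordering; these indices are
# the reordering information that a preordering-aware positional
# encoding can use in place of (or alongside) the surface positions.
preordered_position = {src_idx: new_pos for new_pos, src_idx in enumerate(permutation)}
print(preordered_position)                     # {0: 0, 1: 1, 4: 2, 2: 3, 3: 4}
```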

Highlights

  • The difference between the word orders of the source and target languages significantly influences translation quality in statistical machine translation (SMT) [1]–[3]

  • Compared with recurrent neural network (RNN)-based models, the Transformer model provides significantly improved translation quality

  • We used preordering methods based on bracketing transduction grammar (BTG) [3] and a recursive neural network (RvNN) [6], because both models are state-of-the-art in SMT, and optimized them for the Kendall's τ objective of equation (7); see the sketch after this list
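The highlights mention optimizing the preordering models for a Kendall's τ objective. As a minimal sketch of that quantity (the standard rank-correlation formula over a permutation; the paper's equation (7) may use a different normalization), Kendall's τ can be computed from concordant and discordant pairs of positions:

```python
from itertools import combinations

def kendall_tau(order):
    """Kendall's tau of a permutation relative to the identity order.

    order[i] is the target-side position assigned to the i-th source
    token after preordering. Returns a value in [-1, 1]; 1 means the
    tokens are already in perfect target order.
    """
    pairs = list(combinations(range(len(order)), 2))
    concordant = sum(order[i] < order[j] for i, j in pairs)
    discordant = len(pairs) - concordant
    return (concordant - discordant) / len(pairs)

# Hypothetical example: after preordering, the source tokens map to
# target positions [0, 1, 2, 4, 3] -- almost monotone.
print(kendall_tau([0, 1, 2, 4, 3]))   # 0.8
```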


Summary

INTRODUCTION

The difference between the word orders of the source and target languages significantly influences translation quality in statistical machine translation (SMT) [1]–[3]. Zhao et al. [7] exploited preordering index embeddings in a recurrent neural network (RNN)-based neural machine translation (NMT) model to improve translation quality. Compared with RNN-based models, the Transformer model provides significantly improved translation quality. However, the Transformer cannot handle the order of tokens by itself because it computes each token representation independently, and the existing positional encodings cannot consider the token orders of the source and target sentences simultaneously because they are applied separately on each side. To exploit both the source and target order information in the Transformer model, we propose preordering encoding, which encodes the positions of preordered tokens using absolute [8] and relative [9] encoding approaches. On the English-to-Russian translation task, preordering improved translation quality for relative encoding and by 1.48 BLEU points for absolute encoding.
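A minimal sketch of how the absolute variant of preordering encoding could work, assuming the standard sinusoidal positional encoding is simply evaluated at each token's position after preordering rather than at its surface position (the function, dimensions, and positions below are illustrative assumptions, not the paper's code); a relative variant would analogously use pairwise differences of the preordered positions inside the attention mechanism.

```python
import numpy as np

def sinusoidal_encoding(positions, d_model):
    """Standard Transformer sinusoidal encoding evaluated at the given
    (possibly permuted) integer positions."""
    positions = np.asarray(positions, dtype=float)[:, None]      # (n, 1)
    dims = np.arange(d_model)[None, :]                           # (1, d_model)
    angles = positions / np.power(10000.0, (2 * (dims // 2)) / d_model)
    return np.where(dims % 2 == 0, np.sin(angles), np.cos(angles))  # (n, d_model)

# Hypothetical preordered positions for a 5-token source sentence: each
# token keeps its surface slot in the input, but its positional encoding
# is taken from where it would sit after preordering.
surface_positions    = [0, 1, 2, 3, 4]
preordered_positions = [0, 1, 3, 4, 2]

standard_pe   = sinusoidal_encoding(surface_positions, d_model=8)
preordered_pe = sinusoidal_encoding(preordered_positions, d_model=8)

# In the encoder, the token embeddings would be summed with preordered_pe
# (absolute variant) instead of standard_pe.
print(preordered_pe.shape)   # (5, 8)
```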

Preordering for SMT
Usage of Reordering Information in NMT
Transformer Model
Multi-Head Attention for Encoder
Absolute Encoding and Relative Encoding
Preordering Methods
Preordering Encoding
Corpus and Preprocessing
Training of Preordering Models
Training of NMT Models
Overall Results
ANALYSIS
Upper-Bound of Preordering Encoding
Relation Between Preordering and Translation Qualities
Effects to Under- and Over-Generation
Relation to Sentence Lengths
CONCLUSION