Abstract

Prior research suggests that neural machine translation (NMT) captures word alignment through its attention mechanism; however, this paper finds that attention may almost entirely fail to capture word alignment for some NMT models. It therefore proposes two methods to induce word alignment that are general and agnostic to the specific NMT model. Experiments show that both methods induce much better word alignment than attention does. The paper further visualizes translations through the word alignment induced by NMT. In particular, it analyzes the effect of alignment errors on translation errors at the word level, and its quantitative analysis over many test examples consistently demonstrates that alignment errors are likely to lead to translation errors as measured by different metrics.
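
To make concrete the attention baseline the abstract argues against, here is a minimal sketch of the standard recipe for reading word alignment off an attention matrix: align each target word to the source word it attends to most. The function name `induce_alignment_from_attention` and the toy matrix are illustrative assumptions, not artifacts of the paper.

```python
import numpy as np

def induce_alignment_from_attention(attn: np.ndarray) -> list[tuple[int, int]]:
    """Align each target word to its most-attended source word.

    attn: array of shape (tgt_len, src_len); attn[j, i] is the attention
    weight that target word j places on source word i.
    Returns (source_index, target_index) pairs.
    """
    return [(int(attn[j].argmax()), j) for j in range(attn.shape[0])]

# Toy example: 3 target words attending over 4 source words.
attn = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
])
print(induce_alignment_from_attention(attn))  # [(0, 0), (1, 1), (3, 2)]
```

With multi-layer, multi-head models there is no single attention matrix to read, which is one practical reason this recipe can break down; the paper's finding that attention may fail as an aligner is what motivates its model-agnostic alternatives.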

Highlights

  • Machine translation aims at modeling the semantic equivalence between a pair of source and target sentences (Koehn, 2009), and word alignment tries to model the semantic equivalence between a pair of source and target words (Och and Ney, 2003)

  • This paper makes two-fold contributions: it systematically studies word alignment from neural machine translation (NMT) and proposes two approaches to induce word alignment that are agnostic to specific NMT models

  • In order to induce word alignment from general NMT models, we propose two different methods, which are agnostic to specific NMT models (see the sketch after this list)
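
The bullets do not spell out how such a model-agnostic method might be wired up. As one generic illustration (assuming PyTorch; `AlignmentLayer` and the frozen-state usage are invented for this sketch and are not the paper's exact design), a small alignment scorer can be trained on top of a trained NMT model's encoder and decoder hidden states:

```python
import torch
import torch.nn as nn

class AlignmentLayer(nn.Module):
    """Bilinear scorer over (decoder state, encoder state) pairs, producing
    a distribution over source positions for each target position."""
    def __init__(self, d_model: int):
        super().__init__()
        self.bilinear = nn.Linear(d_model, d_model, bias=False)

    def forward(self, dec_states: torch.Tensor, enc_states: torch.Tensor) -> torch.Tensor:
        # dec_states: (tgt_len, d_model); enc_states: (src_len, d_model)
        scores = self.bilinear(dec_states) @ enc_states.T  # (tgt_len, src_len)
        return torch.softmax(scores, dim=-1)  # per-target-word alignment distribution

# Usage: run a trained NMT model with its parameters frozen, collect encoder
# and decoder hidden states, and train only this layer on top of them.
layer = AlignmentLayer(d_model=512)
dec_states = torch.randn(7, 512)   # e.g., 7 target words
enc_states = torch.randn(9, 512)   # e.g., 9 source words
align = layer(dec_states, enc_states)
alignment = align.argmax(dim=-1)   # align each target word to one source word
```

Because the NMT parameters stay frozen, such a layer probes what the model already encodes rather than changing its translations.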

Summary

Introduction

Machine translation aims at modeling the semantic equivalence between a pair of source and target sentences (Koehn, 2009), and word alignment tries to model the semantic equivalence between a pair of source and target words (Och and Ney, 2003). Our experiments demonstrate that NMT captures good word alignment for those target words mostly contributed from the source (CFS), while its word alignment is much worse for those words mostly contributed from the target (CFT). This finding offers a reason why advanced NMT models that deliver excellent translations capture worse word alignment than the statistical aligners used in SMT, an observation made in prior research but without a deep explanation (Tu et al., 2016; Liu et al., 2016).
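
To ground the CFS/CFT distinction, a prediction-difference style measurement can be used: mask one input word at a time and record how much the model's probability for the target word drops. The sketch below is an illustration under assumed interfaces; the `LogProbFn` hook, the `<mask>` token, and the simple sum-comparison rule are assumptions for exposition, not the paper's exact formulation.

```python
from typing import Callable, Sequence

# Hypothetical hook into an NMT model: returns log P(tgt_word | src, tgt_prefix).
LogProbFn = Callable[[Sequence[str], Sequence[str], str], float]

def source_contributions(
    src: Sequence[str],
    tgt_prefix: Sequence[str],
    tgt_word: str,
    logprob: LogProbFn,
    mask: str = "<mask>",
) -> list[float]:
    """Contribution of each source word to predicting tgt_word, measured
    as the drop in log-probability when that source word is masked out."""
    base = logprob(src, tgt_prefix, tgt_word)
    out = []
    for i in range(len(src)):
        masked = list(src)
        masked[i] = mask
        out.append(base - logprob(masked, tgt_prefix, tgt_word))
    return out

def target_contributions(
    src: Sequence[str],
    tgt_prefix: Sequence[str],
    tgt_word: str,
    logprob: LogProbFn,
    mask: str = "<mask>",
) -> list[float]:
    """The same measurement applied to the already-generated target prefix."""
    base = logprob(src, tgt_prefix, tgt_word)
    out = []
    for j in range(len(tgt_prefix)):
        masked = list(tgt_prefix)
        masked[j] = mask
        out.append(base - logprob(src, masked, tgt_word))
    return out

def classify_cfs_cft(src_contribs: list[float], tgt_contribs: list[float]) -> str:
    """Label a target word CFS when source-side contributions dominate,
    else CFT. A simplified decision rule for illustration only."""
    return "CFS" if sum(src_contribs) >= sum(tgt_contribs) else "CFT"
```

The same scores can double as an alignment signal: aligning each target word to the source word with the largest contribution is one model-agnostic way to induce alignment, in the spirit of prediction-difference methods.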
