Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices

Felix Stahlberg, Adrià de Gispert, Bill Byrne, Eva Hasler

doi:10.18653/v1/e17-2058

Abstract

We present a novel scheme to combine neural machine translation (NMT) with traditional statistical machine translation (SMT). Our approach borrows ideas from linearised lattice minimum Bayes-risk decoding for SMT. The NMT score is combined with the Bayes-risk of the translation according to the SMT lattice. This makes our approach much more flexible than n-best list or lattice rescoring, as the neural decoder is not restricted to the SMT search space. We show an efficient and simple way to integrate risk estimation into the NMT decoder which is suitable for word-level as well as subword-unit-level NMT. We test our method on English-German and Japanese-English and report significant gains over lattice rescoring on several data sets for both single and ensembled NMT. The MBR decoder produces entirely new hypotheses, going far beyond simply rescoring the SMT search space or fixing UNKs in the NMT output.

Highlights

Lattice minimum Bayes-risk (LMBR) decoding has been applied successfully to translation lattices in traditional statistical machine translation (SMT) to improve the translation performance of a single system (Kumar and Byrne, 2004; Tromble et al., 2008; Blackwood et al., 2010).
We show how to reformulate the original LMBR decision rule so that it can be used in a word-based neural machine translation (NMT) decoder which is not restricted to an n-best list or a lattice.
We propose to collect statistics for MBR from a potentially large translation lattice generated with SMT, and to use the n-gram posteriors as an additional score in NMT decoding.
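
The last highlight, using n-gram posteriors as an additional score during NMT decoding, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `thetas` weights (one per n-gram order, plus a per-word term at index 0), the `nmt_weight` factor, and the posterior table are all assumptions chosen for illustration. The sketch only shows the general shape of a linearised LMBR-style score, in which each newly generated word is rewarded for the n-grams it completes, weighted by their posterior probability in the SMT lattice.

```python
from typing import Dict, Sequence, Tuple

NGram = Tuple[str, ...]


def risk_bonus(
    prefix: Sequence[str],
    token: str,
    posteriors: Dict[NGram, float],
    thetas: Sequence[float],
) -> float:
    """Score the n-grams that end at `token` when it is appended to `prefix`.

    thetas[0] is a per-word term; thetas[n] weights n-grams of order n.
    N-grams missing from `posteriors` contribute nothing (hypothetical
    choice: the decoder may leave the SMT search space without being blocked).
    """
    history = list(prefix) + [token]
    bonus = thetas[0]
    for n in range(1, len(thetas)):
        if len(history) < n:
            break  # not enough context yet for this order
        ngram = tuple(history[-n:])
        bonus += thetas[n] * posteriors.get(ngram, 0.0)
    return bonus


def combined_step_score(
    prefix: Sequence[str],
    token: str,
    nmt_logprob: float,
    posteriors: Dict[NGram, float],
    thetas: Sequence[float],
    nmt_weight: float = 1.0,
) -> float:
    """Combine the NMT log-probability of `token` with the n-gram bonus."""
    return nmt_weight * nmt_logprob + risk_bonus(prefix, token, posteriors, thetas)
```

Because the bonus depends only on the last few words of the hypothesis, it can be added at every step of a left-to-right beam search. A word that the SMT lattice never proposed simply receives no bonus, which is what lets the neural decoder produce hypotheses outside the SMT search space.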

Summary

Introduction

Lattice minimum Bayes-risk (LMBR) decoding has been applied successfully to translation lattices in traditional SMT to improve the translation performance of a single system (Kumar and Byrne, 2004; Tromble et al., 2008; Blackwood et al., 2010). Minimum Bayes-risk (MBR) decoding is also a powerful framework for combining diverse systems (Sim et al., 2007; de Gispert et al., 2009). We study combining traditional SMT and NMT in a hybrid decoding scheme based on MBR. We argue that MBR-based methods in their present form are not well suited to NMT for two reasons. First, NMT decoding usually relies on beam search with a limited beam and produces very narrow lattices (Li and Jurafsky, 2016; Vijayakumar et al., 2016). Second, it is difficult to collect the statistics needed for risk calculation for NMT.
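
To make "the statistics needed for risk calculation" concrete, the sketch below computes n-gram posteriors, that is, the total posterior mass of all hypotheses containing a given n-gram. It is a simplified stand-in and rests on assumptions: it takes an explicit, non-empty list of scored hypotheses rather than an SMT lattice, the function name and the softmax normalisation of scores are illustrative choices, and a real lattice would require path-based computation instead of enumeration.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence, Tuple

NGram = Tuple[str, ...]


def ngram_posteriors(
    hyps: Sequence[Tuple[Sequence[str], float]],
    max_order: int = 4,
) -> Dict[NGram, float]:
    """Estimate P(u | x) for every n-gram u up to `max_order`.

    `hyps` is a non-empty list of (tokens, log_score) pairs. Scores are
    normalised with a softmax to obtain hypothesis posteriors (an assumption
    made for this sketch).
    """
    max_score = max(score for _, score in hyps)
    weights = [math.exp(score - max_score) for _, score in hyps]
    total = sum(weights)

    posteriors: Dict[NGram, float] = defaultdict(float)
    for (tokens, _), weight in zip(hyps, weights):
        # Each hypothesis counts once per distinct n-gram it contains.
        seen = set()
        for n in range(1, max_order + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(tuple(tokens[i:i + n]))
        for ngram in seen:
            posteriors[ngram] += weight / total
    return dict(posteriors)
```

With only a handful of hypotheses, as in a narrow beam, most n-grams receive a posterior of either almost zero or almost one, which illustrates why a rich SMT lattice is a more useful source of such statistics than the NMT beam itself.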

Publication Date: Jan 1, 2017
Citations: 62
License type: cc-by

Similar Papers

Multilingual Neural Translation
14 Feb 2020

Improving neural machine translation through phrase-based soft forced decoding
Jingyi Zhang ... Satoshi Nakamura
Machine Translation | VOL. 34
01 Apr 2020

Preventing translation quality deterioration caused by beam search decoding in neural machine translation using statistical machine translation
Emre Satir ... Hasan Bulut
Information Sciences | VOL. 581
06 Oct 2021

Adaptation in Statistical Machine Translation for Low-resource Domains in English-Vietnamese Language
Nghia-Luan Pham ... Van-Vinh Nguyen
VNU Journal of Science: Computer Science and Communication Engineering | VOL. 36
30 May 2020
