Abstract
Decoding of phrase-based translation models in the general case is known to be NP-complete, by a reduction from the traveling salesman problem (Knight, 1999). In practice, phrase-based systems often impose a hard distortion limit that restricts the movement of phrases during translation. However, the impact of this constraint on decoding complexity has not been well studied. In this paper, we describe a dynamic programming algorithm for phrase-based decoding with a fixed distortion limit. The runtime of the algorithm is O(n d! l h^(d+1)), where n is the sentence length, d is the distortion limit, l is a bound on the number of phrases starting at any position in the sentence, and h is related to the maximum number of target-language translations for any source word. The algorithm makes use of a novel representation that gives a new perspective on decoding of phrase-based models.
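To make the setting concrete, here is a minimal sketch of how a conventional coverage-based phrase decoder (in the style of Koehn et al., 2003), not the dynamic programming algorithm described in this paper, enforces a hard distortion limit during beam search. The toy phrase table, the scores, and the exact form of the distortion check are illustrative assumptions.

```python
# A minimal sketch (not this paper's DP algorithm) of a conventional
# coverage-based phrase decoder that enforces a hard distortion limit d.
# The phrase table, scores, and input sentence are illustrative assumptions.

from collections import namedtuple

# coverage: frozenset of already-translated source positions
# last_end: end position (exclusive) of the most recently translated phrase
Hyp = namedtuple("Hyp", "coverage last_end score output")

def decode(src, phrase_table, d, beam_size=10):
    """Beam search over source coverage, rejecting jumps larger than d."""
    n = len(src)
    beams = [[] for _ in range(n + 1)]        # indexed by number of covered words
    beams[0].append(Hyp(frozenset(), 0, 0.0, ()))

    for covered in range(n):
        for hyp in sorted(beams[covered], key=lambda h: -h.score)[:beam_size]:
            for start in range(n):
                # Hard distortion limit: the jump from the end of the previous
                # phrase to the start of the next phrase is bounded by d.
                if abs(start - hyp.last_end) > d:
                    continue
                for end in range(start + 1, n + 1):
                    phrase = tuple(src[start:end])
                    if phrase not in phrase_table:
                        continue
                    if any(i in hyp.coverage for i in range(start, end)):
                        continue
                    for target, score in phrase_table[phrase]:
                        beams[covered + end - start].append(
                            Hyp(hyp.coverage | set(range(start, end)),
                                end, hyp.score + score, hyp.output + (target,)))

    complete = beams[n]
    return max(complete, key=lambda h: h.score) if complete else None

# Toy example (entries are assumptions, not data from the paper).
phrase_table = {
    ("das", "haus"): [("the house", -0.2)],
    ("ist",): [("is", -0.1)],
    ("klein",): [("small", -0.3)],
}
best = decode(["das", "haus", "ist", "klein"], phrase_table, d=2)
print(" ".join(best.output) if best else "no translation found")
```

Note that this baseline tracks a coverage bitmask, which is exponential in the worst case; the paper's contribution is a different representation whose runtime bound is polynomial for a fixed distortion limit.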
Highlights
Phrase-based translation models (Koehn et al., 2003; Och and Ney, 2004) are widely used in statistical machine translation
This paper describes an algorithm for phrase-based decoding with a fixed distortion limit; its runtime is linear in the length of the sentence and, for a fixed distortion limit, polynomial in the other factors
The algorithm builds on the insight that decoding with a hard distortion limit is related to the bandwidth-limited traveling salesman problem (BTSP) (Lawler et al., 1985); see the sketch below
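As a rough illustration of that connection (the paper's actual construction differs), the sketch below checks whether an ordering of source phrase spans obeys a hard distortion limit d. Under such a limit, consecutive jumps through the source sentence are bounded, much as a bandwidth-limited TSP tour may only step between cities whose indices are close. The function and the example spans are assumptions made for illustration.

```python
# Illustration only: a distortion-limit check on a phrase visiting order,
# analogous to the bandwidth condition in the bandwidth-limited TSP.

def respects_distortion_limit(segments, d):
    """segments: list of (start, end) source spans in translation order.
    Returns True if every jump from the end of one phrase to the start of
    the next is at most d (the hard distortion limit)."""
    prev_end = 0
    for start, end in segments:
        if abs(start - prev_end) > d:
            return False
        prev_end = end
    return True

# Monotone order: all jumps have size 0.
print(respects_distortion_limit([(0, 2), (2, 3), (3, 4)], d=1))   # True
# Swapping the last two phrases creates a jump of size 2.
print(respects_distortion_limit([(0, 2), (3, 4), (2, 3)], d=1))   # False
print(respects_distortion_limit([(0, 2), (3, 4), (2, 3)], d=2))   # True
```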
Summary
Phrase-based translation models (Koehn et al., 2003; Och and Ney, 2004) are widely used in statistical machine translation. In practice these systems impose a hard distortion limit on phrase movement, and the complexity of decoding under such a limit is an open question: the NP-hardness result of Knight (1999) applies when there is no restriction on reordering and does not settle the constrained case. This paper describes a dynamic programming algorithm for this setting: for a hard distortion limit d and sentence length n, the runtime is O(n d! l h^(d+1)), where l is a bound on the number of phrases starting at any point in the sentence and h is related to the maximum number of translations for any word in the source-language sentence. The algorithm builds on the insight that decoding with a hard distortion limit is related to the bandwidth-limited traveling salesman problem (BTSP) (Lawler et al., 1985). The algorithm is amenable to beam search. It is quite different from previous methods for decoding of phrase-based models, potentially opening up a very different way of thinking about decoding algorithms for phrase-based models, or more generally for models in statistical NLP that involve reordering.
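As a quick sanity check on the shape of the stated bound (the parameter values below are arbitrary assumptions, not figures from the paper), the following computes n · d! · l · h^(d+1) for a few distortion limits, showing growth that is linear in n for fixed d but rapid in d.

```python
# Toy evaluation of the O(n * d! * l * h^(d+1)) runtime bound.
# The values of n, l, and h are arbitrary assumptions for illustration.
from math import factorial

def runtime_bound(n, d, l, h):
    """Number of basic operations implied by the bound n * d! * l * h^(d+1)."""
    return n * factorial(d) * l * h ** (d + 1)

n, l, h = 30, 5, 10
for d in (2, 3, 4):
    print(f"d={d}: bound = {runtime_bound(n, d, l, h):,}")
```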