Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features

Kevin Gimpel, Noah A. Smith

doi:10.1162/coli_a_00175

Abstract

Recent research has shown clear improvement in translation quality by exploiting linguistic syntax for either the source or target language. However, when using syntax for both languages (“tree-to-tree” translation), there is evidence that syntactic divergence can hamper the extraction of useful rules (Ding and Palmer 2005). Smith and Eisner (2006) introduced quasi-synchronous grammar, a formalism that treats non-isomorphic structure softly using features rather than hard constraints. Although a natural fit for translation modeling, its flexibility has proved challenging for building real-world systems. In this article, we present a tree-to-tree machine translation system inspired by quasi-synchronous grammar. The core of our approach is a new model that combines phrases and dependency syntax, integrating the advantages of phrase-based and syntax-based translation. We report statistically significant improvements over a phrase-based baseline on five of seven test sets across four language pairs. We also present encouraging preliminary results on the use of unsupervised dependency parsing for syntax-based machine translation.

Highlights

Building translation systems for many language pairs requires addressing a wide range of translation divergence phenomena
We present a statistical tree-to-tree machine translation system inspired by quasi-synchronous grammar
All quasi-synchronous phrase dependency (QPD) results are significantly better than all Moses baseline results, but there is no significant difference between the two QPD feature sets

Summary

Introduction

Building translation systems for many language pairs requires addressing a wide range of translation divergence phenomena. Many researchers have incorporated linguistic syntax into translation model design. The availability of syntactic parsers, and gains in their accuracy, triggered research interest in syntax-based statistical machine translation (Yamada and Knight 2001). We use boldface for vectors and we denote individual elements in vectors using subscripts; for example, the source and target sentences are denoted x = x1, . We denote the set containing the first k positive integers as [k].

Journal: Computational Linguistics
Publication Date: Jun 1, 2014
Citations: 65
License type: cc-by
