Automatic machine translation error identification

Débora Beatriz De Jesus Martins, Helena De Medeiros Caseli

doi:10.1007/s10590-014-9163-y

Abstract

Although machine translation (MT) has been studied for decades, the texts generated by state-of-the-art MT systems still contain several errors for many language pairs. To cope with this drawback, considerable effort has gone into post-editing those errors, either manually or automatically. Manual post-editing is more accurate but can be prohibitive when too many changes have to be made. Automatic post-editing demands less effort but can also be less effective and can introduce new errors. One way to avoid unnecessary automatic post-editing and new errors is to select beforehand only the machine-translated segments that really need to be post-edited. This paper therefore describes experiments carried out to automatically identify MT errors generated by a state-of-the-art phrase-based statistical MT system. Although our experiments used a statistical MT engine, we believe the approach can also be applied to other types of MT systems. The experiments investigated the well-known machine-learning algorithms Naive Bayes, Decision Trees and Support Vector Machines. Using the decision tree algorithm, it was possible to identify wrong segments with around 77% precision and recall when a small training corpus of only 2,147 error instances was used. Our experiments were performed on English-to-Brazilian Portuguese MT, and although some of the features are language-dependent, the proposed approach is language-independent and can be easily generalized to other language pairs.
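As a reading aid, the sketch below shows how precision and recall are computed for the "segment contains an error" class, and how a classifier's decisions could be used to pass only flagged segments on to post-editing. It is not the authors' implementation: the function names, the toy labels, and the stand-in classifier rule are all hypothetical, and the paper's actual features and trained models are not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple


def precision_recall(gold: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Precision and recall for the positive class (True = segment has an error)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)        # correctly flagged
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)    # flagged but correct
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)    # missed errors
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def select_for_post_editing(segments: Sequence[str],
                            is_erroneous: Callable[[str], bool]) -> List[str]:
    """Keep only the segments that the classifier flags as needing post-editing."""
    return [segment for segment in segments if is_erroneous(segment)]


# Hypothetical toy labels, invented for illustration only.
gold = [True, True, True, False, False, True]
predicted = [True, True, False, True, False, True]
precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")

# Hypothetical stand-in for a trained classifier: flags segments with a "??" marker.
flagged = select_for_post_editing(["a casa azul", "o house ?? azul"], lambda s: "??" in s)
print(flagged)
```

In practice the `is_erroneous` callable would wrap a trained model such as a decision tree; the rule used here only keeps the example self-contained.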

Journal: Machine Translation
Publication Date: Nov 28, 2014
Citations: 26
