Optimizing Instance Selection for Statistical Machine Translation with Feature Decay Algorithms

Ergun Bicici, Deniz Yuret

doi:10.1109/taslp.2014.2381882

Abstract

We introduce FDA5 for efficient parameterization, optimization, and implementation of feature decay algorithms (FDA), a class of instance selection algorithms that use feature decay. FDA increase the diversity of the selected training set by devaluing features (i.e., n-grams) that have already been included. FDA5 decides which instances to select based on three functions, controlled by five parameters, that initialize feature values, decay feature values, and scale sentence scores. We present optimization techniques that allow FDA5 to adapt these functions to in-domain and out-of-domain translation tasks for different language pairs. In a transductive learning setting, selecting training instances relevant to the test set can improve the final translation quality. In machine translation experiments performed on the 2 million sentence English-German section of the Europarl corpus, we show that a subset of the training set selected by FDA5 can gain up to 3.22 BLEU points compared to a randomly selected subset of the same size, can gain up to 0.41 BLEU points over using all of the available training data while using only 15% of it, and can reach within 0.5 BLEU points of the full training set result while using only 2.7% of the full training data. FDA5 peaks at around 8M words, or 15% of the full training set. In an active learning setting, FDA5 minimizes the human effort by identifying the most informative sentences for translation, and FDA gains up to 0.45 BLEU points using 3/5 of the available training data compared to using all of it, and 1.12 BLEU points compared to a random training set. In translation tasks involving English and Turkish, a morphologically rich language, FDA5 can gain up to 11.52 BLEU points compared to a randomly selected subset of the same size, can achieve the same BLEU score as random instance selection using as little as 4% of the data, and can exceed the full dataset result by 0.78 BLEU points.
With 1M words, FDA5 reduces the time to build a statistical machine translation system to about half, using only 3% of the phrase table space and 8% of the overall space of a baseline system trained on all of the available training data, while remaining within 0.58 BLEU points of that baseline in out-of-domain translation.
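
The selection loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names three functions (feature initialization, feature decay, sentence score scaling) and five parameters but does not give their form, so the formulas below, the parameter names (`init_exp`, `len_exp`, `decay`, `count_exp`, `score_exp`), and the function name `fda_select` are all assumptions chosen for illustration. Features are the n-grams of the test set, and each time a sentence is selected, the value of every feature it contains is reduced so that later picks favor n-grams not yet covered.

```python
import math
from collections import Counter


def ngrams(tokens, max_n=2):
    """Return all n-grams (as tuples) of length 1..max_n."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def fda_select(candidates, test_sentences, budget, init_exp=1.0, len_exp=1.0,
               decay=0.5, count_exp=0.0, score_exp=1.0, max_n=2):
    """Greedy feature-decay selection (illustrative; formulas are assumptions).

    candidates     -- source sentences available for training
    test_sentences -- sentences whose n-grams define the features
    budget         -- maximum number of sentences to select
    """
    test_counts = Counter(f for s in test_sentences
                          for f in ngrams(s.split(), max_n))
    total = sum(test_counts.values())

    # Assumed initialization: rarer and longer test-set n-grams start higher.
    init = {f: math.log(1.0 + total / c) ** init_exp * len(f) ** len_exp
            for f, c in test_counts.items()}

    # Keep only the features of each candidate that occur in the test set.
    cand_feats = [{f for f in ngrams(s.split(), max_n) if f in init}
                  for s in candidates]
    cand_len = [max(len(s.split()), 1) for s in candidates]
    selected_counts = Counter()  # times each feature was already selected

    def value(f):
        # Assumed decay: exponential and polynomial in the selection count.
        k = selected_counts[f]
        return init[f] * decay ** k / (1 + k) ** count_exp

    def score(i):
        # Assumed scaling: sum of feature values, normalized by length.
        return sum(value(f) for f in cand_feats[i]) / cand_len[i] ** score_exp

    remaining = list(range(len(candidates)))
    chosen = []
    while remaining and len(chosen) < budget:
        best = max(remaining, key=score)
        if score(best) <= 0:
            break  # nothing left shares a feature with the test set
        chosen.append(best)
        remaining.remove(best)
        for f in cand_feats[best]:
            selected_counts[f] += 1  # devalue features already included
    return [candidates[i] for i in chosen]


pool = ["the cat sat", "the cat sat down", "a dog barked",
        "completely unrelated words"]
test_set = ["the cat sat", "a dog barked"]
print(fda_select(pool, test_set, budget=2))
```

This sketch rescans every remaining candidate at each step for readability; a practical implementation over millions of sentences would need a more efficient data structure, such as a priority queue with lazily updated scores.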

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Feb 1, 2015
Citations: 63
