A Study of Statistical Machine Translation Methods for Under Resourced Languages

Win Pa Pa, Ye Kyaw Thu, Andrew Finch, Eiichiro Sumita

doi:10.1016/j.procs.2016.04.057

Abstract

This paper contributes an empirical study of the application of five state-of-the-art machine translation methods to the translation of low-resource languages. The methods studied were phrase-based, hierarchical phrase-based, operational sequence model, string-to-tree, and tree-to-string statistical machine translation, applied between English (en) and the under-resourced languages Lao (la), Myanmar (mm), and Thai (th) in both directions. The performance of the machine translation systems was measured automatically in terms of BLEU and RIBES for all experiments. Our main findings were that the phrase-based SMT method generally gave the highest BLEU scores. This was counter to expectations, and we believe it indicates that this method may be more robust to limitations on data set size. However, when evaluated with RIBES, the best scores came from methods other than phrase-based SMT, indicating that the other methods handled word reordering better even under the constraint of limited data. Our study achieved the highest reported results on the data sets for all translation language pairs.

Journal: Procedia Computer Science
Publication Date: Jan 1, 2016
Citations: 14
License type: cc-by-nc-nd

Similar Papers

Modality-Preserving Phrase-Based Statistical Machine Translation
Masamichi Ideue ... Kazuhide Yamamoto
01 Nov 2012

Toward Building a Comprehensive Phrase-based English-Arabic Statistical Machine Translation System
Sara Ebrahim ... Mostafa Mostafa
The Egyptian Journal of Language Engineering | VOL. 4
15 Sep 2017

Analysing terminology translation errors in statistical and neural machine translation
Rejwanul Haque ... Andy Way
Machine Translation | VOL. 34
19 Aug 2020

End-to-End Neural Word Alignment Outperforms GIZA++
Thomas Zenkel ... Joern Wuebker
01 Jan 2020
