Investigating the Relationship between Classification Quality and SMT Performance in Discriminative Reordering Models

Arefeh Kazemi, Andy Way, Mohammadali Nematbakhsh, Amirhassan Monadjemi, Antonio Toral

doi:10.3390/e19090340

Abstract

Reordering is one of the most important factors affecting the quality of the output in statistical machine translation (SMT). A considerable number of the approaches proposed to address the reordering problem are discriminative reordering models (DRMs). The core component of a DRM is a classifier that tries to predict the correct word order of the sentence. Unfortunately, the relationship between classification quality and ultimate SMT performance has not been investigated to date. Understanding this relationship will allow researchers to select the classifier that results in the best possible MT quality. It might be assumed that there is a monotonic relationship between classification quality and SMT performance, i.e., that any improvement in classification performance will be reflected in an improvement in overall SMT quality. In this paper, we experimentally show that this assumption does not always hold, i.e., an improvement in classification performance might actually degrade the quality of an SMT system, as measured by automatic MT evaluation metrics. However, we show that if the improvement in classification performance is high enough, we can expect the SMT quality to improve as well. In addition, we show that there is a negative relationship between classification accuracy and SMT performance in imbalanced parallel corpora. For these types of corpora, we provide evidence that, for the evaluation of the classifier, macro-averaged metrics such as macro-averaged F-measure are better suited than accuracy, the metric commonly used to date.
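The contrast between accuracy and macro-averaged F-measure on imbalanced data can be illustrated with a small sketch. This is not the authors' code or data: the class names `monotone` and `swap`, the 90/10 class split, and the two toy predictors are assumptions chosen only to show how the two metrics can rank classifiers differently.

```python
def accuracy(gold, pred):
    """Fraction of instances whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced test set: 90 "monotone" and 10 "swap" instances.
gold = ["monotone"] * 90 + ["swap"] * 10

# Toy predictor A: always predicts the majority class.
majority = ["monotone"] * 100

# Toy predictor B: makes some errors on the majority class
# but recovers most of the minority class.
balanced = ["monotone"] * 80 + ["swap"] * 10 + ["swap"] * 8 + ["monotone"] * 2

for name, pred in [("majority", majority), ("balanced", balanced)]:
    print(name, round(accuracy(gold, pred), 3), round(macro_f1(gold, pred), 3))
```

In this toy setting the majority-class predictor has the higher accuracy (0.90 against 0.88) but a much lower macro-averaged F1 (about 0.47 against about 0.75), because it never predicts the minority class. This is the kind of disagreement between the two metrics that imbalanced corpora can produce.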

Highlights

Statistical Machine Translation (SMT) systems automatically translate from one natural language into another.
We study the relationship between the performance of the reordering classifier and SMT quality in three parallel corpora from different language pairs, and experimentally show that the assumption of a monotonic relationship between the two does not always hold.
We investigate the relationship between classification performance and SMT quality, and provide some guidelines for intrinsic evaluation of the classification performance in SMT.

Summary

Introduction

Statistical Machine Translation (SMT) systems automatically translate from one natural language into another. To the best of our knowledge, the relationship between classification quality and SMT performance has not been studied to date. It might be assumed that improvements in classifier quality will be monotonically reflected in overall SMT performance. This is the assumption that justifies previous work which tries to find the best classifier for an SMT system based solely on classifier quality metrics [2,3,4]. We study the relationship between the performance of the reordering classifier and SMT quality in three parallel corpora from different language pairs, and experimentally show that this assumption does not always hold.
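One simple way to test the monotonicity assumption on a set of systems is to sort them by classifier score and look for adjacent pairs where the classifier score rises while the MT score falls. The sketch below assumes each system is summarised as a hypothetical `(classifier_score, mt_score)` pair; the function name and the numbers are made-up placeholders, not results from the paper.

```python
def monotonicity_violations(results):
    """Return adjacent pairs (after sorting by classifier score) where the
    classifier score increases but the MT score decreases."""
    ordered = sorted(results)
    violations = []
    for (c1, m1), (c2, m2) in zip(ordered, ordered[1:]):
        if c2 > c1 and m2 < m1:
            violations.append(((c1, m1), (c2, m2)))
    return violations


# Placeholder values for illustration only: (classifier score, MT metric score).
systems = [(0.70, 20.1), (0.72, 19.8), (0.80, 21.0)]

for lower, higher in monotonicity_violations(systems):
    print("classifier improved", lower[0], "->", higher[0],
          "but MT score dropped", lower[1], "->", higher[1])
```

With these placeholder values, the small classifier gain from 0.70 to 0.72 is flagged as a violation, while the larger gain to 0.80 is not. An empty result would mean the adjacent pairs are consistent with the assumption.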



Journal: Entropy
Publication Date: Aug 24, 2017
License type: CC BY 4.0

Similar Papers

Measuring domain similarity for statistical machine translation
Lin Liu ... Tiejun Zhao
01 Jul 2013

Offline Corpus Augmentation for English-Amharic Machine Translation
Yohannes Biadgligne ... Kamel Smaili
01 Mar 2022

Word sense disambiguation for statistical machine translation
Marine Jacinthe Carpuat
23 Dec 2014

Comparing example-based and statistical machine translation
Andy Way ... Nano Gough
Natural Language Engineering | VOL. 11
21 Sep 2005
