QUES: A Quality Estimation System of Arabic to English Translation

Manar Salamah Ali, Najwa Noorwali, Anfal Alatawi, Bayader Alsahafi

doi:10.14569/ijacsa.2020.0110732

Abstract

Estimating translation quality is a problem of growing importance with many potential applications. The quality of Arabic-to-English translation is especially difficult to evaluate because the two languages are distant: they differ in syntax and share little lexical similarity. We propose a feature-based framework for estimating the quality of Arabic-to-English translations at the sentence level. The proposed method works without reference translations, considers both the fluency and the adequacy of translations, and makes no assumptions about the source of the translation (humans, machines, or post-edited machine translations), which makes it applicable in a wider range of situations. We treat translation quality estimation as a supervised machine learning problem. The proposed model uses regression algorithms (Support Vector Regression (SVR) and Linear Regression) to predict quality scores of unseen translated texts at runtime. The models are trained on a labeled parallel corpus, learning a mapping from extracted features to the quality label. The prediction models predicted the fluency and adequacy of translations with a Mean Absolute Error of 0.84 and 1.02, respectively. Furthermore, we show that, in a setting similar to ours, the fluency of an Arabic-to-English translated sentence is, on its own, an appropriate indicator of the translation's overall quality.
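The supervised setup described in the abstract (extract features from a source/translation pair, fit a regressor to quality labels, score predictions with Mean Absolute Error) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the two toy features in `extract_features`, the helper names, and the use of closed-form least squares as a stand-in for Linear Regression. The abstract does not list the actual feature set, and SVR is not shown.

```python
from typing import List, Sequence


def extract_features(source: str, translation: str) -> List[float]:
    """Hypothetical toy features; not the feature set used in the paper."""
    src_tokens = source.split()
    tgt_tokens = translation.split()
    length_ratio = len(tgt_tokens) / max(len(src_tokens), 1)
    avg_word_len = sum(len(t) for t in tgt_tokens) / max(len(tgt_tokens), 1)
    # The leading 1.0 acts as the intercept term of the linear model.
    return [1.0, length_ratio, avg_word_len]


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_regression(X: Sequence[Sequence[float]],
                          y: Sequence[float],
                          ridge: float = 1e-6) -> List[float]:
    """Least squares via the normal equations (tiny ridge term for stability)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * target for row, target in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """Predicted quality score: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this sketch, one model would be fitted on fluency labels and another on adequacy labels, with `mean_absolute_error` computed on held-out sentence pairs for each.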

Full Text