Abstract

Quality estimation (QE) has recently gained increasing interest because it can predict the quality of machine translation (MT) output without a reference translation. QE is an annual shared task at the Conference on Machine Translation (WMT), and most recent studies have applied multilingual pretrained language models (mPLMs) to this task, focusing on performance improvements achieved through data augmentation combined with finetuning of a large-scale mPLM. In this study, we eliminate the effects of data augmentation and conduct a pure performance comparison between various mPLMs. In contrast to recent performance-driven QE research centered on the shared-task competition, we compare the models on the WMT20 sub-tasks and identify an optimal mPLM. Moreover, we demonstrate QE using the multilingual BART model, which has not previously been applied to this task, and conduct comparative experiments and analyses against cross-lingual language models (XLMs), multilingual BERT, and XLM-RoBERTa.

Highlights

  • Unlike previous studies, which mostly utilize the SOTA model, we remove the effects of data augmentation used to achieve performance improvements and perform a comparative study of representative multilingual pretrained language models (mPLMs) on sub-tasks 1 and 2 from WMT20 (see the sketch after this list)

  • To the best of our knowledge, we are the first to conduct such a comparison; through an analysis of how to construct an appropriate input structure for quality estimation (QE), we reveal that performance can be improved by changing the input order of the source sentence and the machine translation (MT) output; when finetuning the mPLMs, we use only the data officially distributed in WMT20 and evaluate on the official test set to ensure objectivity across all experiments

  • English–German is used as the language pair for the experiments, and performance comparisons are conducted for each mPLM at the sentence level of sub-task 1 and sub-task 2
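
As a concrete illustration of this comparative setup, the following is a minimal sketch (our own assumption, not the authors' released code) of how the four compared mPLMs could be loaded with a single-output regression head for sentence-level QE using the Hugging Face transformers library; the checkpoint identifiers are publicly available ones and may differ from the exact configurations used in the paper.

```python
# Minimal sketch (assumption, not the authors' code): load each compared mPLM
# with a one-dimensional regression head, as one might do for sentence-level QE
# (predicting a DA score for sub-task 1 or an HTER score for sub-task 2).
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Public Hugging Face checkpoints assumed here; the paper may use other sizes.
CHECKPOINTS = {
    "mBERT": "bert-base-multilingual-cased",
    "XLM":   "xlm-mlm-tlm-xnli15-1024",
    "XLM-R": "xlm-roberta-large",
    "mBART": "facebook/mbart-large-cc25",
}

def load_qe_model(name: str):
    """Return (tokenizer, model) where the model has a single regression output."""
    ckpt = CHECKPOINTS[name]
    tokenizer = AutoTokenizer.from_pretrained(ckpt)
    # num_labels=1 gives a scalar output, trained with MSE on float labels.
    model = AutoModelForSequenceClassification.from_pretrained(ckpt, num_labels=1)
    return tokenizer, model

tokenizer, model = load_qe_model("XLM-R")
```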


Summary

Introduction

Previous studies have used the input structure [BOS] Source sentence [EOS] [EOS] MT output [EOS] or [BOS] MT output [EOS] [EOS] Source sentence [EOS] without a clear standard. We investigate this choice through a quantitative analysis using different input structures for all mPLMs. The contributions of this study are as follows: we conduct comparative experiments on finetuning mPLMs for the QE task, which differs from research aimed at improving performance in the WMT shared-task competition; to the best of our knowledge, we are the first to conduct such research; through a comparative analysis of how to construct an appropriate input structure for QE, we reveal that performance can be improved by changing the input order of the source sentence and the MT output; and in finetuning the mPLMs, we use only the data officially distributed in WMT20 (without external knowledge or data augmentation) and evaluate on the official test set to ensure objectivity for all experiments.
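
To make the two orderings concrete, the following is a minimal sketch (an assumption on our part, using the Hugging Face tokenizer API rather than the authors' code) of how both input structures can be built; the example sentences and the XLM-RoBERTa checkpoint are illustrative only.

```python
# Minimal sketch (assumption): building the two QE input orderings discussed
# above with a Hugging Face tokenizer. For XLM-R the sentence-pair template is
# "<s> A </s></s> B </s>", matching the [BOS] ... [EOS] [EOS] ... [EOS] pattern.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

src = "The weather is nice today."    # source sentence (illustrative)
mt  = "Das Wetter ist heute schön."   # MT output (illustrative)

# Ordering 1: [BOS] Source sentence [EOS] [EOS] MT output [EOS]
enc_src_first = tokenizer(src, mt, return_tensors="pt")

# Ordering 2: [BOS] MT output [EOS] [EOS] Source sentence [EOS]
enc_mt_first = tokenizer(mt, src, return_tensors="pt")

print(tokenizer.decode(enc_src_first["input_ids"][0]))
print(tokenizer.decode(enc_mt_first["input_ids"][0]))
```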

Related Work and Background
Multilingual BERT
Cross-Lingual Language Model
XLM-RoBERTa
Multilingual BART
Sub-Task 1
Sub-Task 2
Dataset Details
Model Details
Revisiting the QE Input Structure
Experimental Results for Question 2
Conclusions

