A product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation output

Maarit Koponen, Markku Nikulin, Leena Salmi

doi:10.1007/s10590-019-09228-7

Open Access

https://doi.org/10.1007/s10590-019-09228-7

Journal: Computers and Translation
Publication Date: Mar 8, 2019
Citations: 22
License type: open-access

Affiliation: University of Turku

Abstract

This paper presents a comparison of post-editing (PE) changes performed on English-to-Finnish neural (NMT), rule-based (RBMT) and statistical machine translation (SMT) output, combining a product-based and a process-based approach. A total of 33 translation students acted as participants in a PE experiment providing both post-edited texts and edit process data. Our product-based analysis of the post-edited texts shows statistically significant differences in the distribution of edit types between machine translation systems. Deletions were the most common edit type for the RBMT, insertions for the SMT, and word form changes as well as word substitutions for the NMT system. The results also show significant differences in the correctness and necessity of the edits, particularly in the form of a large number of unnecessary edits in the RBMT output. Problems related to certain verb forms and ambiguity were observed for NMT and SMT, while RBMT was more likely to handle them correctly. Process-based comparison of effort indicators shows a slight increase in keystrokes per word for NMT output, and a slight decrease in average pause length for NMT compared to RBMT and SMT in specific text blocks. A statistically significant difference was observed in the number of visits per sub-segment, which is lower for NMT than for RBMT and SMT. The results suggest that although different types of edits were needed for the outputs of the NMT, RBMT and SMT systems, the difference is not necessarily reflected in process-based effort indicators.

Highlights

Recent developments in neural machine translation (NMT) and reported quality improvements over phrase-based statistical machine translation (SMT) have led to much excitement
Based on the analysis of PE changes identified in the final version produced by each participant, the distribution of different edit types was compared in NMT, rule-based MT (RBMT) and SMT versions
Compared to SMT, the NMT output contained fewer omissions and fewer word form errors, but a larger number of extra words

Summary

Introduction

Recent developments in neural machine translation (NMT) and reported quality improvements over phrase-based statistical machine translation (SMT) have led to much excitement. NMT systems have outperformed other types in recent studies and evaluation campaigns in many language pairs. Recent error analyses comparing NMT to SMT systems suggest that NMT produces more fluent output in morphologically rich languages (Toral and Sánchez-Cartagena 2017; Klubička et al 2017, 2018). While some recent studies comparing NMT and SMT systems suggest that NMT produces fewer word form errors, offering potential improvements for morphologically rich languages, the comparative lack of resources still poses issues in the case of Finnish. Castilho et al (2017) report mixed results using different automatic (HTER, BLEU) and human evaluation metrics (fluency, adequacy) – SMT systems outperformed NMT in two case studies out of three – and point out that results vary depending on domain and language pair.
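For readers unfamiliar with HTER, one of the automatic metrics mentioned above, the following is a minimal sketch of the idea: the number of word-level edits needed to turn the MT output into its post-edited version, divided by the length of the post-edited version. This is a simplification and not the procedure used in any of the cited studies. Full TER/HTER also counts block shifts as single edits, which this sketch omits, and the function names `word_edit_distance` and `simplified_hter` are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))  # distance from empty hypothesis to ref[:j]
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a word from the MT output
                            curr[j - 1] + 1,      # insert a word from the post-edit
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def simplified_hter(mt_output, post_edited):
    """Edits per post-edited word; assumption: whitespace tokenization, no shifts."""
    hyp = mt_output.split()
    ref = post_edited.split()
    if not ref:
        raise ValueError("post-edited text must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    # Toy strings for illustration only, not data from the study.
    print(simplified_hter("the cat sat on mat", "the cat sat on the mat"))
```

A lower score means the post-editor changed less of the MT output. Because shifts are not modelled, a moved phrase is counted as several separate edits here, so scores from this sketch can be higher than those from a full TER implementation.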

Objectives

Results

Discussion

Conclusion