Abstract

Whereas post-edited texts have been shown to be either of comparable quality to human translations or better, one study shows that people still seem to prefer human-translated texts. The idea of texts being inherently different despite being of high quality is not new. Translated texts, for example, are also different from original texts, a phenomenon referred to as ‘Translationese’. Research into Translationese has shown that, whereas humans cannot distinguish between translated and original text, computers have been trained to detect Translationese successfully. It remains to be seen whether the same can be done for what we call Post-editese. We first establish whether humans are capable of distinguishing post-edited texts from human translations, and then establish whether it is possible to build a supervised machine-learning model that can distinguish between translated and post-edited text.

Highlights

  • In our increasingly multicultural society, choices need to be made regarding translation production and quality

  • Based on our training data, we calculated Information Gain (IG), Gain Ratio (GR) and chi-squared. These values can be interpreted as feature weights and ranked according to the amount of information they add to discriminating between the two possible labels: PE versus human translations (HT)

  • From the results we observe that all three statistics more or less agree on which features are most discriminative; these are indicated in italics
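The feature-ranking step described in the highlights can be illustrated with a minimal sketch. It assumes discretized features and uses invented toy data; the feature names (`type_token_ratio`, `avg_word_length`) and their values are hypothetical and are not taken from the study. The three functions follow the standard textbook definitions of Information Gain, Gain Ratio and Pearson's chi-squared, which may differ in detail from the tooling the authors used.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


def chi_squared(feature, labels):
    """Pearson chi-squared statistic of the feature-by-label contingency table."""
    total = len(labels)
    observed = Counter(zip(feature, labels))
    feature_counts, label_counts = Counter(feature), Counter(labels)
    stat = 0.0
    for f, f_count in feature_counts.items():
        for lab, lab_count in label_counts.items():
            expected = f_count * lab_count / total
            stat += (observed[(f, lab)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: four post-edited (PE) and four human-translated (HT) texts.
labels = ["PE"] * 4 + ["HT"] * 4
features = {
    # Invented feature that separates the two labels perfectly.
    "type_token_ratio": ["high"] * 4 + ["low"] * 4,
    # Invented feature that carries no information about the label.
    "avg_word_length": ["high", "low"] * 4,
}

# Rank features by information gain, most discriminative first.
ranked = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)

for name in ranked:
    values = features[name]
    print(
        name,
        round(information_gain(values, labels), 3),
        round(gain_ratio(values, labels), 3),
        round(chi_squared(values, labels), 3),
    )
```

In this toy setting all three statistics rank the two features in the same order. With real, continuous features the values would first need to be binned, and the choice of bins can affect the ranking.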

Summary

Introduction

In our increasingly multicultural society, choices need to be made regarding translation production and quality. Research has shown that post-edited (PE) texts are often judged to be of comparable quality to human translations (HT) (Fiederer & O’Brien, 2009; Garcia, 2010; O’Curran, 2014; Plitt & Masselot, 2010) and even of better quality than HTs (Green, 2013; Koponen, 2016). These quality judgements are usually performed by language experts or researchers with a background in linguistics. A comparable study was performed by Bowker and Buitrago Ciro (2015) with Spanish-speaking immigrants in Canada. They presented readers with different versions of a text (HT, maximally PE, rapidly PE, raw MT) and asked them which text they preferred. The respondents chose the HT version of a text in 42% of the cases, compared to
