Abstract

Post-editing has become an important part not only of translation research but also of the global translation industry. While computer-aided translation tools, such as translation memories, are considered part of a translator's work, machine translation (MT) systems have lately also been accepted by human translators. However, many human translators are still adopting the changes that translation technologies have brought to the translation industry. This paper introduces a novel approach for finding suitable pairs of corresponding n-grams between MT output and post-edited MT output, and for recommending them based on the type of text (manual or administrative) and the MT system used for machine translation. A tool that recommends corrections of MT output, and so speeds up correction, was developed to help post-editors with their work. It is based on the analysis of words with the same lemmas and on the analysis of n-gram recommendations. These recommendations are extracted from sequence patterns of the mismatched words (MisMatch) between MT output and post-edited MT output. The paper aims to show how morphological analysis can be used to recommend post-edit operations, and it describes how mismatched words are used in the n-gram recommendations for the post-edited MT output. The contribution consists of a methodology for seeking suitable pairs of words and n-grams and, additionally, of showing the importance of taking metadata into account (the type and/or style of the text and the MT system) when recommending post-edit operations.
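The idea of pairing mismatched n-grams between MT output and post-edited output can be illustrated with a minimal sketch. This is not the authors' tool: it assumes whitespace tokenization, uses Python's standard `difflib` alignment in place of the paper's sequence-pattern analysis, and skips lemmatization entirely. The function names `mismatch_pairs` and `count_recommendations` and the sample sentences are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def mismatch_pairs(mt_tokens, pe_tokens):
    """Return (MT n-gram, post-edited n-gram) pairs where the token lists differ.

    Assumption: a plain difflib alignment stands in for the paper's
    MisMatch extraction; only substituted spans are reported.
    """
    matcher = SequenceMatcher(a=mt_tokens, b=pe_tokens, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            pairs.append((tuple(mt_tokens[i1:i2]), tuple(pe_tokens[j1:j2])))
    return pairs


def count_recommendations(corpus):
    """Count mismatch pairs per (text type, MT system) metadata combination.

    `corpus` is an iterable of (text_type, mt_system, mt_sentence, pe_sentence).
    """
    counts = Counter()
    for text_type, mt_system, mt, pe in corpus:
        for mt_ngram, pe_ngram in mismatch_pairs(mt.split(), pe.split()):
            counts[(text_type, mt_system, mt_ngram, pe_ngram)] += 1
    return counts


# Hypothetical sample data: two segments from a manual translated by GT.
sample = [
    ("manual", "GT", "press the red button", "push the red button"),
    ("manual", "GT", "press the key", "push the key"),
]
recommendations = count_recommendations(sample)
```

Keying the counts by text type and MT system mirrors the abstract's point that metadata matters: the same mismatch pair can be frequent for one system or text type and rare for another.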

Highlights

  • Tasks that were exclusively based on human thinking and intelligence are gradually starting to be operated by machines

  • The following assumptions were stated: (i) Machine translation (MT) output translated by Google Translate (GT) is expected to have a significant impact on the quantity of extracted rules

  • (ii) MT output translated by GT is expected to have a significant impact on decreasing the portion of rules

  • (iii) The style/type of text is expected to have a significant impact on the quantity of extracted rules


Summary

Introduction

Tasks that were exclusively based on human thinking and intelligence are gradually starting to be operated by machines. Post-editing (PE) constitutes a significant element of the translation industry [1]. Its situation and quality can be improved with the improvement of. The possibility of escaping from routine and from dull translations, and the ability to use technology while translating technical or administrative documents, undoubtedly increase translation productivity in multilingual societies, that is, societies opting for rapid and high-quality translation services [3,4,5]. The core of computer-aided translation (CAT) tools lies in the translation memory (TM), a program that stores parallel text aligned into segments (e.g. the popular Trados, Memsource or MemoQ). A segment does not necessarily equal a sentence; sometimes it is just a phrase. The parallel text consists of the original and its translation, usually aligned 1 to 1.
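The description of a translation memory as a store of aligned segments can be pictured with a small data-structure sketch. It is purely illustrative and does not reflect the internals of Trados, Memsource or MemoQ: the class names `Segment` and `TranslationMemory`, the similarity threshold, and the sample segment pair are all assumptions, and similarity is measured with the standard-library `difflib` ratio rather than any commercial fuzzy-match score.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """One aligned unit: a source sentence or phrase and its translation."""
    source: str
    target: str


@dataclass
class TranslationMemory:
    """Hypothetical in-memory TM holding 1-to-1 aligned segments."""
    segments: List[Segment] = field(default_factory=list)

    def add(self, source: str, target: str) -> None:
        self.segments.append(Segment(source, target))

    def lookup(self, source: str, threshold: float = 0.7) -> Optional[Segment]:
        """Return the most similar stored segment, or None below the threshold."""
        best, best_score = None, 0.0
        for segment in self.segments:
            score = SequenceMatcher(a=source, b=segment.source, autojunk=False).ratio()
            if score > best_score:
                best, best_score = segment, score
        return best if best_score >= threshold else None


# Hypothetical English-Slovak segment pair, used only for illustration.
tm = TranslationMemory()
tm.add("Press the red button.", "Stlačte červené tlačidlo.")
exact = tm.lookup("Press the red button.")
fuzzy = tm.lookup("Press the red button")
missing = tm.lookup("Completely unrelated text", threshold=0.9)
```

A segment here can hold a full sentence or just a phrase, which matches the point that a segment does not have to equal a sentence.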

Methods
Results
Discussion
Conclusion
