Abstract

Corpus-based machine translation (MT) has been the main approach to developing and implementing MT systems in both academia and industry over the last three decades. In this field, the type and size of the corpus used for training MT engines have posed problems for both statistical MT (SMT) and neural MT (NMT) systems, the two dominant corpus-based approaches. Moreover, language pairs such as Turkish-English have been understudied within this framework. This article evaluates translation quality in Turkish-to-English custom MT systems trained on corpora of different sizes and types. Two NMT engines and two SMT engines were trained on the KantanMT platform using two training corpus types: a domain-specific cardiology corpus alone, or this corpus combined with a mixed-domain corpus. The study conducted automatic evaluations with metrics including BLEU, F-Measure, and TER, as well as a comprehensive human evaluation covering fluency, A/B testing, and adequacy. Lastly, the study carried out a separate, subjective terminology evaluation to investigate how differently the MT systems handle terminology, a crucial aspect of domain-specific text types such as cardiology. While the automatic evaluation results suggest that the SMT engines perform better than the NMT engines, all human evaluators rated the mixed-domain NMT engine as the highest-performing one. However, the terminology evaluation showed that SMT can still perform better and commit fewer terminology errors, despite industry and academia shifting toward NMT engines.
