Abstract

Many studies have shown that applying a controlled language (CL) is an effective pre-editing technique for improving machine translation (MT) output. In this paper, we investigate whether this also holds true for neural machine translation (NMT). We compare the impact of applying nine CL rules on the quality of NMT output with their impact on rule-based, statistical, and hybrid MT output, using three methods: error annotation, human evaluation, and automatic evaluation. The analyzed data is a German corpus-based test suite of technical texts translated into English by five MT systems (one neural, one rule-based, one statistical, and two hybrid systems). The comparison is conducted in terms of several quantitative parameters: number of errors, error types, quality ratings, and automatic evaluation metric scores. The results show that the CL rules positively affect the rule-based, statistical, and hybrid MT systems. However, they do not improve the results of the NMT system. The NMT output is mostly error-free both before and after CL application and has the highest quality among the analyzed MT systems in both scenarios, yet it shows a decrease in quality after the CL rules are applied. The qualitative discussion of the NMT output sheds light on the problems that CL causes for this kind of MT architecture.
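
As one illustration of the automatic evaluation step mentioned above, the sketch below scores MT output against reference translations with BLEU before and after CL pre-editing. This is a minimal sketch assuming the sacrebleu library; the sentences and the choice of BLEU are illustrative assumptions, not the paper's actual test suite or toolchain.

```python
# Minimal sketch of the automatic-evaluation step: scoring MT output
# against references before and after controlled-language (CL) pre-editing.
# Assumes the sacrebleu library; all sentences are invented placeholders,
# not the paper's actual German-English test suite.
import sacrebleu

# Reference (human) translations, one list per reference set.
references = [[
    "Press the start button to switch on the device.",
    "Check the oil level before starting the engine.",
]]

# MT output for the original (uncontrolled) source sentences.
hyp_original = [
    "Press the start button to switch the device on.",
    "Check oil level before the engine is started.",
]

# MT output for the CL pre-edited source sentences.
hyp_cl = [
    "Press the start button to switch on the device.",
    "Check the oil level before you start the engine.",
]

# corpus_bleu expects a list of hypotheses and a list of reference sets.
bleu_original = sacrebleu.corpus_bleu(hyp_original, references)
bleu_cl = sacrebleu.corpus_bleu(hyp_cl, references)

print(f"BLEU without CL: {bleu_original.score:.1f}")
print(f"BLEU with CL:    {bleu_cl.score:.1f}")
```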
