Abstract: Since September 2016, when Google implemented the neural machine translation (NMT) technique in Google Translate (GNMT), the overall quality of machine translation (MT) has improved significantly, narrowing the gap between MT and human translation (HT). One specific aspect of MT quality is how well the system copes with the translation of proper names and numbers, so-called Named Entities (NEs). This paper outlines the principles underlying the GNMT translation of NEs and analyzes the form changes, including errors, that occur during NMT-based translation, together with their causes. First, on the basis of Wu et al. (2016), a simple model for GNMT-based NE translation is created; within this model, NE translations in 60 target texts are analyzed and compared to those of a professional translator. According to this model, GNMT-derived NE-form changes can be roughly attributed to four causes: segmentation, memory, features of training corpora, and functionality in the attention system.