This paper compares translations produced by professional human translators (as manifested in the Babel English-Chinese Parallel Corpus) with machine translations produced by Google Translate (with English as the source language and Simplified Chinese as the target language). Capitalizing on the opposite branching directions of these two languages, we investigated translation strategies for restrictive relative clauses by focusing on which-relatives. The specific objectives of the study were to (a) extract relevant data from a parallel corpus; (b) assess machine translation quality; and (c) highlight the similarities and dissimilarities between corpus outputs and machine translation outputs. Of the 147 test materials, 115 Google Translate outputs (78.2%) were rated as ‘successful’ or ‘acceptable’. No significant association was found between the degree of success of the machine translations and linguistic factors (e.g., active or passive voice in the relative clause). This finding confirms that linguistic knowledge is not required when using statistical machine translation (SMT) such as Google Translate. Another noteworthy finding was that Google Translate did not use the least frequently occurring translation strategy in the parallel corpus. This is not surprising given that SMT systems rely heavily on parallel corpora for training statistical translation models.