Abstract
There is still much to learn about the ways in which human and machine translation differ with regard to the contexts that regulate the production and interpretation of discourse. The present study explores whether a corpus-driven lexical analysis of human and machine translation can unveil discourse features that set the two apart. A balanced corpus of source texts aligned with authentic, professional translations and neural machine translations was compiled for the study. Lexical discrepancies in the two translation corpora were then extracted via a corpus-driven keyword analysis, and examined qualitatively through parallel concordances of source texts aligned with human and machine translation. The study shows that keyword analysis not only reiterates known problems of discourse in machine translation such as lexical inconsistency and pronoun resolution, but can also provide valuable insights regarding contextual aspects of translated discourse deserving further research.
Highlights
Writers are often given mixed messages with respect to word choice
As the analyses presented in this paper are concerned with single documents and their translations, the per-term, per-document Herfindahl-Hirschman Index (HHI) scores are sufficient
The trend is that consistency is irrelevant in translating light verbs, rare verbs tend to be translated with the highest consistency, and mid-range verbs are somewhere in between
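As an illustration of the per-term, per-document HHI mentioned above, the following is a minimal sketch. It assumes the standard definition of the Herfindahl-Hirschman Index (the sum of squared shares), applied here to the shares of each distinct translation of a source term within one document. The function names `hhi` and `hhi_per_term` and the input format (a list of aligned source/target term pairs) are hypothetical; the source does not state how aligned pairs are obtained or whether any normalisation is applied.

```python
from collections import Counter, defaultdict


def hhi(translations):
    """Sum of squared shares of each distinct translation.

    1.0 means the term was always translated the same way;
    lower values mean the translations are spread over more variants.
    """
    if not translations:
        raise ValueError("need at least one observed translation")
    counts = Counter(translations)
    total = sum(counts.values())
    return sum((count / total) ** 2 for count in counts.values())


def hhi_per_term(aligned_pairs):
    """Compute one HHI score per source term for a single document.

    aligned_pairs is an assumed input format: an iterable of
    (source_term, target_term) tuples taken from one document.
    """
    by_term = defaultdict(list)
    for source_term, target_term in aligned_pairs:
        by_term[source_term].append(target_term)
    return {term: hhi(targets) for term, targets in by_term.items()}


# Toy data, invented purely for illustration.
pairs = [
    ("house", "Haus"), ("house", "Haus"), ("house", "Haus"), ("house", "Gebäude"),
    ("make", "machen"), ("make", "treffen"),
]
scores = hhi_per_term(pairs)
```

In this toy example, "house" is translated the same way three times out of four, so its score is higher than that of "make", whose two occurrences receive two different translations.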
Summary
Writers are often given mixed messages with respect to word choice. On the one hand, they are encouraged to vary their use of words (in essay writing): “It is important that the words you use are varied, so that you aren’t using the same words again and again.” On the other hand, they are encouraged to use the same words (only changing the determiner) when referring to the same entity a second time (in technical writing): “The first time a single countable noun is introduced, use a.” Halliday and Hasan (1976) showed that well-written documents exhibit lexical cohesion in terms of what they call reiteration and collocation. A collocation is a sequence of words or terms that co-occur regularly in text.