Abstract

The study and annotation of discourse markers (DMs) in the context of translation is a much-needed and challenging task, not only for descriptive translation studies but also for Natural Language Processing (NLP) applications. Their various meanings are difficult to identify and annotate, even for trained human experts. This chapter proposes a methodology for the analysis and annotation of DMs, using three highly frequent English DMs (in fact, actually and really) and their translations into Spanish as a case study. The methodology consists of an initial corpus analysis phase followed by a corpus annotation phase. The corpus analysis provides qualitative and quantitative information on the meanings of these DMs by looking at their translations in large parallel corpora. The corpus annotation phase specifies the annotation procedure, which can be generalized to other DMs and to other language pairs and can form the basis for large-scale cross-linguistic annotation of DMs.

