Abstract

Discourse markers are highly polyfunctional, particularly in spoken settings. Because they are syntactically optional, they are often omitted in translation, especially in the restricted space of subtitles such as the parallel transcripts of TED Talks. In this study, we combine discourse annotation and translation spotting to investigate English discourse markers, focusing on their functions, their omission and their translation equivalents in Czech, French, Hungarian and Lithuanian. In particular, we study them through the lens of underspecification, of which we distinguish one monolingual and two multilingual types. After making an inventory of all discourse markers in the dataset, we zoom in on the three most frequent: "and", "but" and "so". Our small-scale yet fine-grained corpus study suggests that the processes of underspecification are based on the semantics of discourse markers and are therefore shared cross-linguistically. However, not all discourse marker types, and not all of their functions, are equally affected by underspecification. Moreover, monolingual and multilingual underspecification do not always coincide for a given marker. Beyond the empirical analysis of three highly frequent discourse markers in a sample of TED Talks, this study illustrates how translation and annotation can be combined to explore the multiple facets of underspecification from both a monolingual and a multilingual perspective.
