Abstract
This paper describes a corpus of situated multiparty chats developed for the STAC project (Strategic Conversation, ERC grant n. 269427) and annotated for discourse structure in the style of Segmented Discourse Representation Theory (SDRT; Asher & Lascarides, 2003). The STAC corpus is not only a rich source of data on strategic conversation, but also the first corpus that we are aware of that provides discourse structures for multiparty dialogues situated within a virtual environment. The corpus was annotated in two stages: we initially annotated the chat moves only, but later decided to annotate interactions between the chat moves and nonlinguistic events from the virtual environment. This two-step procedure has allowed us to quantify various ways in which adding information from the nonlinguistic context affects dialogue structure. In this paper, we look at how annotations based only on linguistic information were preserved once the nonlinguistic context was factored in. We explain that while the preservation of relation instances is relatively high when we move from one corpus to the other, there is little preservation of higher-order structures that capture "the main point" of a dialogue and distinguish it from peripheral information.
Highlights
The study of discourse structure, in particular rhetorical structure, on texts is a well-entrenched cottage industry in computational linguistics
Considering the nature of the situated annotations and how they compare with the chat-only annotations allowed us to illuminate and measure a variety of ways in which information from the nonlinguistic context can influence the content and structure of a discourse
Our study provides new data and statistics to substantiate claims from Hunter et al. (2018) that modelling discourse situated in a dynamically evolving nonlinguistic context requires attributing a rich structure to that context, and that this structure can interact with purely linguistic structures in new and interesting ways
Summary
The study of discourse structure, in particular rhetorical structure, on texts is a well-entrenched cottage industry in computational linguistics. Both chat moves and game events contribute arguments to rhetorical relations, which allows us to account for the flexible dynamics of natural discourse situations. The analysis developed in this paper goes considerably beyond models of the nonlinguistic context in terms of deixis and reference, in which nonlinguistic entities are understood as crucial for the interpretation of discourse structures, but not for their construction (Kaplan, 1989; Rickheit and Wachsmuth, 2006; Kranstedt et al., 2004; Kruijff et al., 2010). It extends the work of Tenbrink et al. (2013) on the NavSpace corpus, in which nonlinguistic actions are treated as contributing dialogue acts but the topic of their contributions to overall rhetorical structures is not broached. This comparison gives a fuller picture of the role of nonlinguistic structures in the overall interpretation of our situated interactions.