Abstract

This paper describes a corpus of situated multiparty chats developed for the STAC project (Strategic Conversation, ERC grant n. 269427) and annotated for discourse structure in the style of Segmented Discourse Representation Theory (SDRT; Asher & Lascarides, 2003). The STAC corpus is not only a rich source of data on strategic conversation, but also the first corpus that we are aware of that provides discourse structures for multiparty dialogues situated within a virtual environment. The corpus was annotated in two stages: we initially annotated the chat moves only, but later decided to annotate interactions between the chat moves and nonlinguistic events from the virtual environment. This two-step procedure has allowed us to quantify various ways in which adding information from the nonlinguistic context affects dialogue structure. In this paper, we look at how annotations based only on linguistic information were preserved once the nonlinguistic context was factored in. We explain that while the preservation of relation instances is relatively high when we move from one corpus to the other, there is little preservation of higher-order structures that capture "the main point" of a dialogue and distinguish it from peripheral information.

Highlights

  • The study of discourse structure, in particular rhetorical structure, on texts is a well-entrenched cottage industry in computational linguistics

  • Considering the nature of the situated annotations and how they compare with the chat-only annotations allowed us to illuminate and measure a variety of ways in which information from the nonlinguistic context can influence the content and structure of a discourse

  • Our study provides new data and statistics to substantiate claims from Hunter et al (2018) that modelling discourse situated in a dynamically evolving nonlinguistic context requires attributing a rich structure to that context, and that this structure can interact with purely linguistic structures in new and interesting ways

Summary

Introduction

The study of discourse structure, in particular rhetorical structure, on texts is a well-entrenched cottage industry in computational linguistics. Both chat moves and game events contribute arguments to rhetorical relations, which allows us to account for the flexible dynamics of natural discourse situations. The analysis developed in this paper goes considerably beyond models of the nonlinguistic context in terms of deixis and reference, in which nonlinguistic entities are understood as crucial for the interpretation of discourse structures, but not for their construction (Kaplan, 1989; Rickheit and Wachsmuth, 2006; Kranstedt et al, 2004; Kruijff et al, 2010). It extends Tenbrink et al (2013)’s work on the NavSpace corpus, in which nonlinguistic actions are treated as contributing dialogue acts but the topic of their contributions to overall rhetorical structures is not broached. This comparison gives a fuller picture of the role of nonlinguistic structures in the overall interpretation of our situated interactions.

The Settlers corpus
Moving to situated dialogue
The conceptualization and structure of game events
Relations relevant for the interpretation of the situated annotations
Complex structures relevant for the situated annotations
Preservation of linguistic structure
Preservation of substructures
Findings
Conclusions