Abstract

Describing implicit phenomena in discourse is known to be a problematic task, from both theoretical and empirical perspectives. The present article contributes to this topic with a novel comparative analysis of two prominent annotation approaches to discourse relations (coherence relations) that were applied to the same texts. We compare the annotation of implicit relations in the Penn Discourse Treebank 2.0, i.e. discourse relations not signaled by an explicit discourse connective, with the recently released analysis of signals of rhetorical relations in the RST Signalling Corpus (RST-SC). The intersection of corresponding pairs of relations is rather small, but it shows a clear tendency: unlike the overall signal distribution in the RST-SC, more than half of the signals in the studied intersection are of the semantic type, formed mostly by loosely defined lexical chains. Our data transformation allows for a simultaneous depiction and detailed study of the two resources.

Highlights

  • In recent discourse-oriented research, great attention has been paid to discourse markers or discourse connectives, which are agreed to be the most apparent anchors of discourse relations, and in this way substantially contribute to discourse coherence

  • We focus on the Penn Discourse Treebank 2.0 (PDTB, Prasad et al. (2008b)) annotation of implicit discourse relations, look for their counterparts in the RST Signalling Corpus (RST-SC, Das et al. (2015)) and analyze the coherence signals assigned to these relations in the RST signalling annotation

  • It is only natural that a high fraction of the semantic signals in implicit relations, namely those annotated as lexical chains, are very difficult to assess and for annotators to agree on


Summary

Introduction

Introduction and Motivation

POLÁKOVÁ, MÍROVSKÝ AND SYNKOVÁ

In recent discourse-oriented research, great attention has been paid to discourse markers or discourse connectives, which are agreed to be the most apparent anchors of discourse relations, and in this way substantially contribute to discourse coherence. A natural step in the research on discourse coherence is to answer the general question of how coherence is established if such a connective device is not present between given text segments. We may want to describe what other overtly present language elements, even elements not directly associated with discourse coherence functions, can play a role in our understanding of such a connection. We may be interested in the way our mind processes such a connection (let us say, an implicit relation), why we all (mostly) understand and interpret a given text in the same way and what part of the overall meaning is inferred from a co-textual, situational or world knowledge context. From the perspective of automatic text processing, we may want to model discourse coherence with the help of detecting the same signals/features a human normally uses for full comprehension of texts.
