Abstract
We investigate how discourse relations and their subtypes are signalled, extending the set of discourse signals from connectives and lexical cue phrases to the wide range of semantic, syntactic, and orthographic signals of the RST Signalling Corpus (Das, Debopam & Maite Taboada. 2018. RST signalling corpus. Language Resources and Evaluation 52. 149–184). This extension requires re-evaluating previous predictions on discourse signalling, in particular, those of Sanders' causality-by-default hypothesis (Sanders, Ted. 2005. Coherence, causality and cognitive complexity in discourse. In M. Aurnague, M. Bras, A. Le Draoulec & L. Vieu (eds.), Proceedings/Actes SEM-05, first international symposium on the exploration and modelling of meaning, 105–114. Biarritz), the hypothesis of uniform information density (Frank, Austin & Florian Jaeger. 2008. Speaking rationally: Uniform information density as an optimal strategy for language production. In Proceedings of the 30th annual meeting of the Cognitive Science Society, 933–938. https://escholarship.org/uc/item/7d08h6j4 (accessed 18 May 2022)), and the hypothesis that discourse is continuous by preference (Segal, Erwin, Judith Duchan & Paula Scott. 1991. The role of interclausal connectives in narrative structuring. Discourse Processes 14. 27–54; Murray, John. 1997. Connectives and narrative text. Memory and Cognition 25. 227–236). We evaluate the predictions of these theories on the conditional relations in the RST Discourse Treebank (Carlson, Lynn, Daniel Marcu & Mary Ellen Okurowski. 2002. RST Discourse Treebank. LDC2002T07. Philadelphia: Linguistic Data Consortium), using causal relations as a control group. Informativity and continuity are operationalized in terms of semantic complexity and Givón's dimensions of deictic shift (Givón, Talmy. 1993. English grammar: A function-based introduction, vol. 2. Amsterdam: John Benjamins).
Our results show that the hypotheses make accurate predictions only for the relation groups in their entirety but not for the observed in-group variation, in particular, the low amount of marking for the hypothetical subtype of conditional relations. We attribute this difference to the distribution of intra- and inter-sentential occurrences across the conditional subtypes: intra-sentential relations are consistently more marked than inter-sentential ones, and hypothetical relations are special in that they appear predominantly inter-sententially.