Abstract
This article introduces Discourse Combinatory Categorial Grammar (DCCG) and shows how it can be used to generate multisentence paraphrases, flexibly incorporating both intra- and intersentential discourse connectives. DCCG employs a simple, practical approach to extending Combinatory Categorial Grammar (CCG) to encompass coverage of discourse-level phenomena, which furthermore makes it possible to generate clauses with multiple connectives and — in contrast to approaches based on Rhetorical Structure Theory — with rhetorical dependencies that do not form a tree. To do so, it borrows from Discourse Lexicalized Tree Adjoining Grammar (D-LTAG) the distinction between structural connectives and anaphoric discourse adverbials. Unlike D-LTAG, however, DCCG treats both sentential and discourse phenomena in the same grammar, rather than employing a separate discourse grammar. A key ingredient of this single-grammar approach is cue threading, a tightly constrained technique for extending the semantic scope of a discourse connective beyond the sentence. As DCCG requires no additions to the CCG formalism, it can be used to generate paraphrases of an entire dialogue turn using the OpenCCG realizer as-is, without the need to revise its architecture. In addition, from an interpretation perspective, a single grammar enables easier management of ambiguity across discourse and sentential levels using standard dynamic programming techniques, whereas D-LTAG has required a potentially complex interaction of sentential and discourse grammars to manage the same ambiguity. As a proof-of-concept, the article demonstrates how OpenCCG can be used with a DCCG to generate multi-sentence paraphrases that reproduce and extend those in the SPaRKy Restaurant Corpus.
Highlights
In this article, we introduce Discourse Combinatory Categorial Grammar (DCCG), a simple, practical approach to extending Combinatory Categorial Grammar (CCG; Steedman, 2000; Steedman and Baldridge, 2009) to encompass coverage of discourse-level phenomena.
We have shown how a simple cue threading technique enables a lexicalized grammar such as CCG to be extended to handle structural discourse connectives, leaving discourse adverbials to be handled via anaphora resolution, without resorting to the use of two separate grammars, as in Discourse Lexicalized Tree Adjoining Grammar (D-LTAG).
Currently we do not employ an ambiguity resolution module, which can result in a huge number of parses for a given comparison, or an anaphora resolution module, which results in underspecified reference nominals for anaphoric elements in the parse tree.
Summary
We introduce Discourse Combinatory Categorial Grammar (DCCG), a simple, practical approach to extending Combinatory Categorial Grammar (CCG; Steedman, 2000; Steedman and Baldridge, 2009) to encompass coverage of discourse-level phenomena. DCCG treats both sentential and discourse phenomena in the same grammar, with no additions required to the CCG formalism. In this way, DCCG can be used with existing CCG chart realization techniques (White, 2004, 2006a,b) to generate paraphrases that are not confined to single sentences, but rather allow for a flexible treatment of both intra- and inter-sentential discourse connectives (such as "but"). With this hand-crafted grammar, several hundred to several thousand multi-sentence paraphrases can typically be generated from a disjunctive logical form that compactly specifies the possible realizations. These realizations range in quality from excellent to nearly unreadable, and a ranking model is needed to separate the wheat from the chaff.
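To give a feel for why a compact disjunctive specification can yield so many paraphrases, the following is a minimal sketch in Python. It is not an OpenCCG logical form and does not use the OpenCCG realizer: the `SLOTS` structure, the `expand` function, and the example sentences about "Restaurant A" and "Restaurant B" are all hypothetical, chosen only to illustrate how independent choices multiply.

```python
from itertools import product

# Hypothetical toy specification (an assumption, not an OpenCCG logical form):
# each slot lists alternative surface choices for one part of the comparison.
SLOTS = [
    # Alternative wordings of the first clause.
    ["Restaurant A has very good decor", "Restaurant A's decor is very good"],
    # Alternative connectives: one intrasentential, two intersentential.
    [", but ", ". However, ", ". On the other hand, "],
    # Alternative wordings of the second clause.
    ["Restaurant B has decent decor", "Restaurant B's decor is decent"],
]


def expand(slots):
    """Enumerate every combination of one choice per slot as a text."""
    return ["".join(choice) + "." for choice in product(*slots)]


if __name__ == "__main__":
    paraphrases = expand(SLOTS)
    # The count is the product of the slot sizes: 2 * 3 * 2.
    print(len(paraphrases))
    for text in paraphrases:
        print(text)
```

With only three slots the sketch already produces twelve texts, and each further independent choice multiplies the total, which is consistent with the hundreds to thousands of realizations reported above. The sketch enumerates strings naively and performs no ranking; in the approach described here, the grammar constrains which combinations are well formed and a separate ranking model is needed to select among them.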