Anaphora Resolution with the ARRAU Corpus

Massimo Poesio, Nafise Moosavi, Olga Uryupina, Juntao Yu, Adam Roussel, Ina Roesiger, Yulia Grishina, Fabian Simonjetz, Heike Zinsmeister, Varada Kolhatkar, Alexandra Uma

doi:10.18653/v1/w18-0702

Abstract

The ARRAU corpus is an anaphorically annotated corpus of English providing rich linguistic information about anaphora resolution. The most distinctive feature of the corpus is the annotation of a wide range of anaphoric relations, including bridging references and discourse deixis in addition to identity (coreference). Other distinctive features include treating all NPs as markables, including non-referring NPs, and the annotation of a variety of morphosyntactic and semantic mention and entity attributes, including the genericity status of the entities referred to by markables. The corpus, however, has not been extensively used for anaphora resolution research so far. In this paper, we discuss three datasets extracted from the ARRAU corpus to support the three subtasks of the CRAC 2018 Shared Task (identity anaphora resolution over ARRAU-style markables, bridging reference resolution, and discourse deixis); the evaluation scripts assessing system performance on those datasets; and preliminary results on these three tasks that may serve as a baseline for subsequent research on these phenomena.

Highlights

The release of the ONTONOTES coreference corpus (Pradhan et al., 2007a) and the organization of two CONLL shared tasks based on the dataset (Pradhan et al., 2012) have resulted in a substantial increase in coreference research, in both quantity and quality.
A simple form of discourse deixis, event anaphora, is annotated in ONTONOTES. Bridging reference was not annotated, although a subset of the corpus has been annotated with this information by Markert et al. (2012).
Marasovic et al. (2017) developed an approach to abstract anaphora resolution that uses bidirectional LSTMs to produce representations of the anaphor and the candidate sentence, together with a mention-ranking component adapted from the systems of Clark and Manning (2016) and Wiseman et al. (2015).

Summary

Introduction

The release of the ONTONOTES coreference corpus (Pradhan et al., 2007a) and the organization of two CONLL shared tasks based on the dataset (Pradhan et al., 2012) have resulted in a substantial increase in coreference research, in both quantity and quality. Anaphora resolution, however, involves a number of phenomena besides ‘coreference’, such as bridging reference (Clark, 1975) and discourse deixis (Webber, 1991). A simple form of discourse deixis, event anaphora, is annotated in ONTONOTES. Bridging reference was not annotated, although a subset of the corpus has been annotated with this information by Markert et al. (2012). In ARRAU, all NPs are considered markables, including expletives and singletons, and both discourse deixis and bridging reference have been annotated. Nevertheless, the corpus has not been extensively used for anaphora resolution research so far. There are a number of reasons for this, ranging from the fact that research on both bridging reference and discourse deixis is still limited to the unusual markup format. Our hope is that making datasets extracted from the corpus available may, on the one hand, facilitate the use of ARRAU and, on the other, enlarge the community of researchers working on these aspects of anaphora resolution.

Genres

Markables

Types of anaphoric relations marked

Two releases

Markup

Identity anaphora

Discourse Deixis

The Three Tasks of CRAC 2018

Markable Settings

Task 1

Task 2

Task 3

Markable extraction

Conclusions

A Appendix


Publication Date: Jan 1, 2018
Citations: 70	License type: cc-by

