Robustness in Coreference Resolution

Nafise Sadat Moosavi

doi:10.11588/heidok.00027919

Abstract

Coreference resolution is the task of determining which expressions in a text refer to the same entity. Resolving coreferring expressions is an essential step in the automatic interpretation of text. While coreference information is beneficial for various NLP tasks such as summarization, question answering, and information extraction, state-of-the-art coreference resolvers are barely used in any of these tasks. The problem is the lack of robustness in coreference resolution systems: a coreference resolver that gets higher scores on the standard evaluation set does not necessarily perform better than the others on a new test set. In this thesis, we address robustness in coreference resolution by (1) introducing a reliable evaluation framework for recognizing robust improvements, and (2) proposing a solution that results in robust coreference resolvers. As the first step of setting up the evaluation framework, we introduce a reliable evaluation metric, called LEA, that overcomes the drawbacks of the existing metrics. We analyze LEA based on various types of errors in coreference outputs and show that it results in reliable scores. In addition to an evaluation metric, we also introduce an evaluation setting in which we disentangle coreference evaluations from parsing complexities. Coreference resolution is affected by parsing complexities when detecting the boundaries of expressions that have complex syntactic structures. We reduce the effect of parsing errors in coreference evaluation by automatically extracting a minimum span for each expression. We then emphasize the importance of out-of-domain evaluations and generalization in coreference resolution and discuss the reasons behind the poor generalization of state-of-the-art coreference resolvers. Finally, we show that enhancing state-of-the-art coreference resolvers with linguistic features is a promising approach for making coreference resolvers robust across domains. Incorporating linguistic features with all their values does not improve performance. However, we introduce an efficient pattern mining approach, called EPM, that mines all feature-value combinations that are discriminative for coreference relations. We then incorporate only these discriminative feature-values. By employing EPM feature-values, performance improves significantly across various domains.


