Experiences on the Improvement of Logic-Based Anaphora Resolution in English Texts

Stefano Ferilli, Domenico Redavid

doi:10.3390/electronics11030372


Open Access

https://doi.org/10.3390/electronics11030372


Abstract

Anaphora resolution is a crucial task for information extraction. Syntax-based approaches rely on the syntactic structure of sentences, while knowledge-poor approaches aim to avoid the need for further external resources or knowledge. This paper proposes a knowledge-poor, syntax-based approach to anaphora resolution in English texts. Our approach improves on the traditional algorithm that is considered the standard baseline for comparison in the literature. Its most relevant contributions are its ability to handle different kinds of anaphora differently and to disambiguate alternative associations using gender recognition of proper nouns. The former is obtained by refining the rules in the baseline algorithm, while the latter is obtained using a machine learning approach. Experimental results on a standard benchmark dataset show that our approach can significantly improve performance over the standard baseline algorithm, and that it also compares well to the state-of-the-art algorithm, which thoroughly exploits external knowledge. It is also efficient. Thus, we propose to use our algorithm as the new baseline in the literature.

Highlights

The current wide availability and continuous increase of digital documents, especially in textual form, make it impossible to process them manually, except for a few selected and very important ones.
Going beyond ‘simple’ information retrieval, which is typically based on some kind of lexical indexing of the texts, the information extraction field of research aims to understand a text’s content and distill it, so as to provide it to end users or make it available for further automated processing. Among other objectives, it would be extremely useful to automatically extract the facts and relationships expressed in the text and formalize them into a knowledge base that can subsequently be consulted for many different purposes, e.g., answering queries whose answer is explicitly reported in the knowledge base, or carrying out formal reasoning that infers information not explicitly reported in it.
Precision and recall express, respectively, the ratio of correct answers among the answers given and the ratio of correct answers over the real set of correct answers, in terms of the parameters TP (True Positives, the number of items correctly retrieved), FP (False Positives, the number of items wrongly retrieved), FN (False Negatives, the number of items wrongly discarded) and TN (True Negatives, the number of items correctly discarded). These metrics require all correct answers for the dataset to be known, and they ignore the fact that, in Anaphora Resolution (AR), the queries themselves are not known in advance: the system itself is in charge of identifying the anaphoras.
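
The two ratios described above match the standard definitions of precision, TP / (TP + FP), and recall, TP / (TP + FN). The following is a minimal sketch of how they could be computed from the counts. The function names, the example counts and the convention of returning 0.0 for an empty denominator are assumptions made for illustration and are not taken from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Ratio of correct answers among the answers given: TP / (TP + FP)."""
    retrieved = tp + fp
    # Assumed convention: report 0.0 when nothing was retrieved.
    return tp / retrieved if retrieved else 0.0


def recall(tp: int, fn: int) -> float:
    """Ratio of correct answers over the real set of correct answers: TP / (TP + FN)."""
    relevant = tp + fn
    # Assumed convention: report 0.0 when there are no correct answers at all.
    return tp / relevant if relevant else 0.0


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    tp, fp, fn = 8, 2, 4
    print(f"precision = {precision(tp, fp):.3f}")
    print(f"recall    = {recall(tp, fn):.3f}")
```

Note that TN does not appear in either formula, and that both counts presuppose a fixed set of queries. That presupposition is what fails in AR, where identifying the anaphoras is part of the task.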

Summary

Introduction

The current wide availability and continuous increase of digital documents, especially in textual form, make it impossible to process them manually, except for a few selected and very important ones.

A cataphora (from Greek ‘carrying down’) is in some sense the ‘opposite’ of an anaphora [8]: whilst the latter references an entity located earlier in the text, the former references an entity that will be mentioned later in the discourse (typically in the same sentence). For example, in ‘When he arrived, John sat down’, the pronoun ‘he’ precedes the entity it refers to. This kind of reference is more frequent in poetry, but can also be found in common language.

Basics and Related Work

Anaphora and Anaphora Resolution

Pronominal Anaphora

Noun Phrases

Other Anaphoric References

Anaphora Resolution Algorithms

Hobbs’ Naïve Algorithm

Liang and Wu’s Approach

Evaluation

Proposed Algorithm

Gender Recognition

Implementation and Experimental Results

Gender Prediction

Anaphora Resolution Effectiveness and Efficiency

Conclusions


Journal: Electronics	Publication Date: Jan 26, 2022
Citations: 3	License type: CC BY 4.0


