Abstract

The problem of populating an ontology consists in adding new, domain-specific content to it from input expressed, in particular, in natural language. We focus on an important aspect of the ontology population process: finding and resolving coreferences, i.e., similar mentions of entities in the input text. Our contribution is a novel formal framework that extends state-of-the-art approaches to coreference resolution by using multiple semantic similarity properties in the resolution process, i.e., we extend the list of ontological properties used for coreference resolution with additional properties such as inverse, symmetry, intersection, union, etc. We use the proposed framework to improve our previously proposed coreference resolution algorithm, which is part of our general approach to text analysis and information extraction for populating subject domain ontologies. We describe a multi-agent implementation of our information extraction system and show that using additional semantic similarity measures for evaluating coreferential candidates improves the quality of the coreference resolution process, especially for complex objects whose coreferencing has not yet been studied in detail.

Highlights

  • The process of ontology population is an actively studied problem of adding new instances of concepts to an ontology

  • The solution to the ontology population task is closely tied to the development of natural language processing (NLP) techniques used in information extraction (IE), with coreference resolution being one of the most challenging NLP tasks

  • Our main contribution in this paper is a formal framework for coreference resolution in the process of ontology population

Summary

INTRODUCTION

The process of ontology population is an actively studied problem of adding new instances of concepts to an ontology. We use the proposed framework to improve the coreference resolution algorithm we suggested in [6] for deciding on candidate admissibility, which is used in our general approach to text analysis and information extraction for populating subject domain ontologies. Afterwards, a semantic coreference algorithm runs on the RDF graph to revise the results of the text-based step: instances are merged if they belong to the same class in the domain ontology and their string similarity is higher than a predefined threshold. These approaches to coreference resolution provide insufficient completeness, in particular because they make poor use of the features of ontology classes and relations.
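The merge rule described above (same class in the domain ontology, string similarity above a predefined threshold) can be illustrated with a minimal sketch. This is not the authors' algorithm or that of the systems they discuss; the `Instance` type, the `merge_coreferent` function, the threshold value of 0.8, and the use of Python's `difflib.SequenceMatcher` as the string similarity measure are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Instance:
    """Hypothetical extracted instance: an ontology class name plus a text label."""
    cls: str
    label: str


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] (an illustrative choice of measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_coreferent(instances, threshold=0.8):
    """Group instances that share a class and whose labels are similar enough.

    The threshold is an assumed value; the source only says it is predefined.
    """
    parent = list(range(len(instances)))  # union-find forest over list indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Compare every pair once and union the pairs that satisfy both conditions.
    for i in range(len(instances)):
        for j in range(i + 1, len(instances)):
            a, b = instances[i], instances[j]
            if a.cls == b.cls and string_similarity(a.label, b.label) > threshold:
                parent[find(j)] = find(i)

    # Collect the members of each merged group, preserving input order.
    groups = {}
    for i, inst in enumerate(instances):
        groups.setdefault(find(i), []).append(inst)
    return list(groups.values())


mentions = [
    Instance("Person", "John Smith"),
    Instance("Person", "john smith"),
    Instance("Organization", "John Smith"),
    Instance("Person", "Mary Jones"),
]
groups = merge_coreferent(mentions)
```

In this sketch the two `Person` mentions of John Smith end up in one group, while the `Organization` with the same label stays separate because its class differs. A rule of this kind looks only at the class and the label, which is consistent with the limitation noted above: it ignores the other features of ontology classes and relations.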

BASIC DEFINITIONS
THE SEMANTIC MEASURE OF COREFERENCE SIMILARITY
A MULTI-AGENT APPROACH TO COREFERENCE RESOLUTION IN THE
CONCLUSION