Abstract

We present DefIE, an approach to large-scale Information Extraction (IE) based on a syntactic-semantic analysis of textual definitions. Given a large corpus of definitions, we first leverage syntactic dependencies to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. The output of DefIE is a high-quality knowledge base consisting of several million automatically acquired semantic relations.
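The first two stages sketched in the abstract (extracting a compact relation pattern along syntactic dependencies, then disambiguating the arguments and the relation's content words) can be illustrated with a toy example. The hand-written parse, the helper functions, and the sense labels below are illustrative assumptions, not the authors' implementation:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Token:
    i: int                        # position in the sentence
    text: str
    head: int                     # index of the syntactic head (-1 for the root)
    dep: str                      # dependency label
    sense: Optional[str] = None   # sense/entity id filled in by disambiguation

# Hand-written toy parse of the definition "Ulysses is a novel by James Joyce."
# (indices, labels and the sense inventory below are illustrative only).
sent: List[Token] = [
    Token(0, "Ulysses", 1, "nsubj"),
    Token(1, "is", -1, "ROOT"),
    Token(2, "a", 3, "det"),
    Token(3, "novel", 1, "attr"),
    Token(4, "by", 3, "prep"),
    Token(5, "James_Joyce", 4, "pobj"),
]

def path_to_root(sent: List[Token], i: int) -> List[int]:
    """Token indices from position i up to the root of the parse."""
    path = [i]
    while sent[i].head != -1:
        i = sent[i].head
        path.append(i)
    return path

def dependency_path(sent: List[Token], a: int, b: int) -> List[int]:
    """Indices on the shortest dependency path between tokens a and b."""
    pa, pb = path_to_root(sent, a), path_to_root(sent, b)
    lca = next(i for i in pa if i in set(pb))   # lowest common ancestor
    return pa[: pa.index(lca) + 1] + list(reversed(pb[: pb.index(lca)]))

# Stage 1: keep only the words on the dependency path between the two
# arguments, yielding a compact relation pattern and reducing sparsity.
subj = next(t.i for t in sent if t.dep == "nsubj")
obj = next(t.i for t in sent if t.dep == "pobj")
on_path = dependency_path(sent, subj, obj)

# Stage 2: disambiguate the arguments and the relation's content words
# (stubbed dictionary lookup standing in for a real disambiguation system).
senses = {"Ulysses": "Ulysses_(novel)", "novel": "novel(written_work)",
          "James_Joyce": "James_Joyce"}
pattern = " ".join(senses.get(sent[i].text, sent[i].text)
                   for i in sorted(on_path) if i not in (subj, obj))
triple = (senses[sent[subj].text], pattern, senses[sent[obj].text])
print(triple)   # ('Ulysses_(novel)', 'is novel(written_work) by', 'James_Joyce')
```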

Highlights

  • The problem of knowledge acquisition lies at the core of Natural Language Processing

  • A more radical approach is adopted in systems like TEXTRUNNER (Etzioni et al., 2008) and REVERB (Fader et al., 2011), which developed from the Open Information Extraction (OIE) paradigm (Etzioni et al., 2008) and focused on the unconstrained extraction of a large number of relations from massive unstructured corpora

  • We presented DefIE, an approach to OIE that, thanks to a novel unified syntactic-semantic analysis of text, harvests instances of semantic relations from a corpus of textual definitions

Introduction

The problem of knowledge acquisition lies at the core of Natural Language Processing. A more radical approach is adopted in systems like TEXTRUNNER (Etzioni et al., 2008) and REVERB (Fader et al., 2011), which developed from the Open Information Extraction (OIE) paradigm (Etzioni et al., 2008) and focused on the unconstrained extraction of a large number of relations from massive unstructured corpora. All these endeavors were geared towards addressing the knowledge acquisition problem and tackling long-standing challenges in the field, such as Machine Reading (Mitchell, 2005).
