Abstract
We present DefIE, an approach to large-scale Information Extraction (IE) based on a syntactic-semantic analysis of textual definitions. Given a large corpus of definitions, we leverage syntactic dependencies to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. The output of DefIE is a high-quality knowledge base consisting of several million automatically acquired semantic relations.
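As a rough illustration of the three stages named above (syntactic extraction, disambiguation, hierarchical organization), the toy sketch below walks a single copular definition through a simplified pipeline. Everything here is a hypothetical stand-in: the regex replaces a real dependency parse, the tiny `SENSES` dictionary replaces a BabelNet-style sense inventory, and the head-verb grouping is only a crude proxy for the paper's relation hierarchy.

```python
import re

# Hypothetical sense inventory standing in for a real lexical resource
# (these sense labels are illustrative, not the paper's actual output).
SENSES = {"guitar": "guitar#n#1", "strings": "string#n#3"}

def extract_triple(definition):
    """Naive stand-in for the syntactic step: match a copular
    definition 'A/An X is a Y that/which VERB OBJ' and return a
    (subject, relation string, object) triple."""
    m = re.match(
        r"An? (\w+) is an? ([\w ]+?) (?:that|which) (\w+) ([\w ]+?)\.?$",
        definition,
    )
    if not m:
        return None
    subj, hypernym, verb, obj = m.groups()
    return (subj, f"is a {hypernym} that {verb}", obj)

def disambiguate(triple):
    """Stand-in for the semantic step: attach word senses to the
    arguments when they appear in the toy inventory."""
    subj, rel, obj = triple
    tag = lambda w: SENSES.get(w, w)
    return (tag(subj), rel, tag(obj))

def relation_hierarchy(relations):
    """Crude stand-in for the hierarchy step: group relation strings
    by their final (head) verb, so more specific relation strings
    cluster under the same head as their generalizations."""
    tree = {}
    for r in relations:
        tree.setdefault(r.split()[-1], []).append(r)
    return tree
```

For example, `extract_triple("A guitar is a musical instrument that has strings.")` yields `("guitar", "is a musical instrument that has", "strings")`, which `disambiguate` turns into a sense-tagged triple; `relation_hierarchy` then files that relation string under the head verb `has` alongside the bare relation `has`.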
Highlights
The problem of knowledge acquisition lies at the core of Natural Language Processing
A more radical approach is adopted in systems like TEXTRUNNER (Etzioni et al., 2008) and REVERB (Fader et al., 2011), which developed from the Open Information Extraction (OIE) paradigm (Etzioni et al., 2008) and focused on the unconstrained extraction of a large number of relations from massive unstructured corpora
We presented DefIE, an approach to OIE that, thanks to a novel unified syntactic-semantic analysis of text, harvests instances of semantic relations from a corpus of textual definitions
Summary
The problem of knowledge acquisition lies at the core of Natural Language Processing. A more radical approach is adopted in systems like TEXTRUNNER (Etzioni et al., 2008) and REVERB (Fader et al., 2011), which developed from the Open Information Extraction (OIE) paradigm (Etzioni et al., 2008) and focused on the unconstrained extraction of a large number of relations from massive unstructured corpora. All these endeavors were geared towards addressing the knowledge acquisition problem and tackling long-standing challenges in the field, such as Machine Reading (Mitchell, 2005).