Canonicalizing Open Knowledge Bases

Luis Galárraga,Kevin Murphy,Geremy Heitz,Fabian M Suchanek

doi:10.1145/2661829.2662073

Open Access

Publication Date: Nov 3, 2014

Citations: 129

Affiliation: Télécom Paris, Google (United States)

#Open Information Extraction Approaches #Open IE

Abstract

Open information extraction approaches have led to the creation of large knowledge bases from the Web. The problem with such methods is that their entities and relations are not canonicalized, leading to redundant and ambiguous facts. For example, they may store {Barack Obama, was born, Honolulu} and {Obama, place of birth, Honolulu}. In this paper, we present an approach based on machine learning methods that can canonicalize such Open IE triples, by clustering synonymous names and phrases. We also provide a detailed discussion about the different signals, features and design choices that influence the quality of synonym resolution for noun phrases in Open IE KBs, thus shedding light on the middle ground between open and closed information extraction systems.
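
To make the idea of canonicalizing noun phrases concrete, the following is a minimal Python sketch, not the authors' method. It assumes a simple token-overlap (Jaccard) similarity and an arbitrary threshold of 0.5 in place of the learned signals the abstract mentions; the names `jaccard`, `cluster_phrases`, and `threshold` are hypothetical. It clusters the noun phrases from the abstract's example and rewrites each triple with one representative per cluster. Relation phrases such as "was born" and "place of birth" are left untouched here.

```python
from collections import defaultdict


def jaccard(a, b):
    """Token-overlap similarity between two phrases (an assumed stand-in signal)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cluster_phrases(phrases, threshold=0.5):
    """Group phrases whose similarity reaches the threshold (single-link, union-find).

    Returns a dict mapping each phrase to its cluster representative,
    chosen here (arbitrarily) as the longest phrase in the cluster.
    """
    parent = {p: p for p in phrases}

    def find(p):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    items = sorted(parent)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if jaccard(a, b) >= threshold:
                parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for p in items:
        clusters[find(p)].append(p)

    return {p: max(members, key=len)
            for members in clusters.values()
            for p in members}


# The two redundant triples from the abstract.
triples = [
    ("Barack Obama", "was born", "Honolulu"),
    ("Obama", "place of birth", "Honolulu"),
]

# Collect every subject and object noun phrase, then canonicalize them.
noun_phrases = {s for s, _, _ in triples} | {o for _, _, o in triples}
canon = cluster_phrases(noun_phrases)
canonical_triples = [(canon[s], r, canon[o]) for s, r, o in triples]
```

With this toy similarity, "Obama" and "Barack Obama" share one of two distinct tokens, so they fall into the same cluster and both triples end up with the subject "Barack Obama". A purely lexical signal like this would wrongly merge different entities that share a surname, which is one reason the choice of signals and features matters.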

Full Text