Abstract

Various knowledge bases (KBs) have been constructed via information extraction from encyclopedias, text and tables, as well as alignment of multiple sources. Their usefulness and usability are often limited by quality issues. One common issue is the presence of erroneous assertions and alignments, often caused by lexical or semantic confusion. We study the problem of correcting such assertions and alignments, and present a general correction framework which combines lexical matching, context-aware sub-KB extraction, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated with one set of literal assertions from DBpedia, one set of entity assertions from an enterprise medical KB, and one set of mapping assertions from a music KB constructed by integrating Wikidata, Discogs and MusicBrainz. It has achieved promising results, with a correction rate (i.e., the ratio of the target assertions/alignments that are corrected with the right substitutes) of 70.1%, 60.9% and 71.8%, respectively.
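
As a rough illustration of the correction rate defined above, the following Python sketch computes it as the fraction of target assertions whose proposed substitute matches a gold-standard substitute. The function name, the dictionary layout and the toy triples are assumptions made for illustration; they are not taken from the paper or its evaluation data.

```python
from typing import Dict, Optional, Tuple

# An assertion is modelled here as a (subject, predicate, object) triple.
Assertion = Tuple[str, str, str]


def correction_rate(
    proposed: Dict[Assertion, Optional[str]],
    gold: Dict[Assertion, str],
) -> float:
    """Fraction of target assertions corrected with the right substitute.

    `gold` maps each erroneous target assertion to its right substitute object.
    `proposed` maps a target assertion to the substitute a system suggested,
    or None (or a missing key) when no correction was made.
    """
    if not gold:
        raise ValueError("gold must contain at least one target assertion")
    corrected = sum(
        1 for target, right in gold.items() if proposed.get(target) == right
    )
    return corrected / len(gold)


# Hypothetical toy data: three erroneous assertions, two corrected properly.
gold = {
    ("ex:Song1", "ex:performer", '"Queen"'): "ex:Queen_band",
    ("ex:Drug1", "ex:treats", "ex:Cold_temperature"): "ex:Common_cold",
    ("ex:Album1", "ex:sameAs", "ex:Album9"): "ex:Album1_discogs",
}
proposed = {
    ("ex:Song1", "ex:performer", '"Queen"'): "ex:Queen_band",
    ("ex:Drug1", "ex:treats", "ex:Cold_temperature"): "ex:Influenza",
    ("ex:Album1", "ex:sameAs", "ex:Album9"): "ex:Album1_discogs",
}

rate = correction_rate(proposed, gold)
print(f"correction rate: {rate:.3f}")  # prints "correction rate: 0.667"
```

Under this reading, an assertion that receives a wrong substitute counts the same as one that receives no substitute at all; only exact matches with the gold substitute raise the rate.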

Highlights

  • Knowledge Bases (KBs), whose variants are often known as Knowledge Graphs [22], are playing an increasingly important role in applications such as search engines, question answering, common sense reasoning and data integration.

  • C KB whose TBox is defined by clinic experts and ABox is extracted from medical articles by some open information extraction tools, and (iii) mapping assertions in a music KB that is constructed

  • We find that filtering with either assertion prediction (AP) or constraint-based validation (CV) can improve the correction rate


Summary

Introduction

Knowledge Bases (KBs), whose variants are often known as Knowledge Graphs [22], are playing an increasingly important role in applications such as search engines, question answering, common sense reasoning and data integration. They include general purpose KBs such as Wikidata [60], DBpedia [2] and NELL [38], as well as domain specific KBs such as Discogs and MusicBrainz.

Chen et al / An assertion and alignment correction framework for large scale knowledge bases

[…] knowledge engineering [65], and they often include knowledge from multiple sources that has been integrated via some alignment procedure [6,67]. Notwithstanding their important role, these KBs still suffer from various quality issues, including constraint violations and erroneous assertions [15,46], that negatively impact their usefulness and usability. It may use a more expressive language such as
