Abstract

Knowledge bases (KBs) are essential for many downstream NLP tasks, yet their prime shortcoming is that they are often incomplete. State-of-the-art frameworks for KB completion often lack sufficient accuracy to operate fully automatically, without human supervision. As a remedy, we propose IntKB: a novel interactive framework for KB completion from text based on a question answering pipeline. Our framework is tailored to the specific needs of a human-in-the-loop paradigm: (i) We generate facts that are aligned with text snippets and are thus immediately verifiable by humans. (ii) Our system is designed such that it continuously learns during the KB completion task and, therefore, significantly improves its performance on initially zero- and few-shot relations over time. (iii) We only trigger human interactions when there is enough information for a correct prediction. To this end, we train our system with negative examples and a fold option for cases in which there is no answer. Our framework yields favorable performance: it achieves a hit@1 ratio of 29.7% for initially unseen relations, which gradually improves to 46.2%.
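As a point of reference for the reported metric, the following is a minimal sketch of how a hit@1 ratio is commonly computed: the fraction of queries whose top-ranked prediction matches the gold answer. The function name `hits_at_1` and the input layout are illustrative assumptions and are not taken from the IntKB source code.

```python
from typing import Sequence


def hits_at_1(ranked_predictions: Sequence[Sequence[str]],
              gold_answers: Sequence[str]) -> float:
    """Return the fraction of queries whose first-ranked prediction is correct.

    ranked_predictions[i] is the list of candidate answers for query i,
    ordered from most to least confident; an empty list means the system
    produced no answer for that query (counted as a miss).
    """
    if len(ranked_predictions) != len(gold_answers):
        raise ValueError("predictions and gold answers must have equal length")
    if not gold_answers:
        return 0.0
    correct = sum(
        1
        for predictions, gold in zip(ranked_predictions, gold_answers)
        if predictions and predictions[0] == gold
    )
    return correct / len(gold_answers)
```

Under this definition, a hit@1 ratio of 29.7% means that for roughly three in ten queries the single highest-ranked candidate was the correct fact.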

Highlights

  • Knowledge bases (KBs) are databases that store information about entities and the relations among them

  • We propose IntKB: a novel interactive framework for KB completion based on question answering (QA)

  • We developed a combination of a neural QA pipeline and a novel training framework with fact alignment and continuous updates

Summary

Introduction

Knowledge bases (KBs) are databases that store information about entities and the relations among them. Knowledge base completion is often designed as an automated task where predictions are directly integrated into an existing but incomplete KB. This can be problematic for various reasons: (i) The accuracy of state-of-the-art systems is not sufficient to allow for an automatic integration of new facts (Akrami et al., 2020). In IntKB, new candidate facts are presented to human annotators, who can approve or reject the candidates before they are integrated into the KB. Based on this setting, we overcome several challenges that are inherent to prior work: First, candidate predictions need to be human-verifiable. A live demo is available at https://wikidatacomplete.org, and the source code is available at https://github.com/bernhard2202/intkb
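The approve-or-reject setting described above can be illustrated with a small sketch. It is an assumption-laden illustration, not the IntKB implementation: the `Candidate` class, the `review_candidates` function, and the representation of the fold option as `answer=None` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


@dataclass(frozen=True)
class Candidate:
    """A predicted fact together with the text snippet that supports it."""
    subject: str
    relation: str
    answer: Optional[str]  # None models the fold option: no answer found
    snippet: str           # evidence shown to the annotator for verification
    score: float


def review_candidates(candidates: Iterable[Candidate],
                      approve: Callable[[Candidate], bool],
                      kb: Set[Triple]) -> Tuple[List[Candidate], List[Candidate]]:
    """Show non-folded candidates to an annotator and add approved facts to the KB.

    Returns the accepted and rejected candidates. Keeping both lists is an
    assumption of this sketch: they could serve as feedback for later updates.
    """
    accepted: List[Candidate] = []
    rejected: List[Candidate] = []
    for candidate in candidates:
        if candidate.answer is None:
            continue  # folded: no human interaction is triggered
        if approve(candidate):
            kb.add((candidate.subject, candidate.relation, candidate.answer))
            accepted.append(candidate)
        else:
            rejected.append(candidate)
    return accepted, rejected
```

In this sketch, `approve` stands in for the human annotator. Because each candidate carries its supporting snippet, the decision can be made from the displayed text alone, which is the sense in which predictions are human-verifiable.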

Related Work
Preliminaries
IntKB: A Verifiable Framework for Interactive KB Completion
Task Definition
Answer Triggering
Entity Linking
Cold-Start
Continuous Improvement from User Interactions
Dataset
Computational Experiments
Performance on “Known” Subset
Performance on “Zero-Shot” Subset
Continuous Learning from User Interactions
Performance on Answer Back-Linking
Discussion
Findings
A Details on Dataset Preprocessing