Abstract
Data from relational web tables can be used to augment cross-domain knowledge bases like DBpedia, Wikidata, or the Google Knowledge Graph with descriptions of entities that are not yet part of the knowledge base. Such long-tail entities include, for instance, small villages, niche songs, or athletes who play in lower-level leagues. In previous work, we presented an approach that successfully assembles descriptions of long-tail entities from relational HTML tables using supervised matching methods and manually labeled training data in the form of positive and negative entity matches. Manually labeling training data is a laborious task for knowledge bases that cover many different classes. In this work, we investigate reducing the labeling effort for the task of long-tail entity extraction by using weak supervision. We present a bootstrapping approach that requires domain experts to provide a small set of simple, class-specific matching rules, instead of requiring them to label a large set of entity matches, thereby reducing the human supervision effort considerably. We evaluate this weak supervision approach and find that it performs only slightly worse than methods that rely on large sets of manually labeled entity matches.
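To illustrate what such a class-specific matching rule might look like, the sketch below implements a hypothetical rule for a song class that weakly labels candidate (web-table row, knowledge-base entity) pairs as matches or non-matches and abstains on ambiguous pairs. The attribute names, thresholds, and similarity measure are assumptions made for illustration and are not taken from the paper.

```python
# Minimal, hypothetical sketch of a class-specific matching rule used as weak
# supervision. Field names, thresholds, and the similarity measure are
# illustrative assumptions, not the rules or schema used in the paper.
from difflib import SequenceMatcher

POSITIVE, NEGATIVE, ABSTAIN = 1, 0, -1

def name_similarity(a: str, b: str) -> float:
    """Simple string similarity in [0, 1]; the paper's features may differ."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def song_matching_rule(row: dict, entity: dict) -> int:
    """Weakly label a (web-table row, KB entity) candidate pair for the song
    class: a match needs a near-identical title and the same artist; clearly
    different titles are labeled non-matches; everything else is left unlabeled."""
    title_sim = name_similarity(row.get("title", ""), entity.get("label", ""))
    same_artist = name_similarity(row.get("artist", ""), entity.get("artist", "")) > 0.9
    if title_sim > 0.9 and same_artist:
        return POSITIVE
    if title_sim < 0.5:
        return NEGATIVE
    return ABSTAIN  # ambiguous pairs are left for the bootstrapped matcher

# Weakly labeled pairs like these could then serve as training data for a
# supervised matching model in place of manually labeled entity matches.
candidates = [
    ({"title": "Yellow", "artist": "Coldplay"}, {"label": "Yellow", "artist": "Coldplay"}),
    ({"title": "Yellow River", "artist": "Christie"}, {"label": "Yellow", "artist": "Coldplay"}),
]
weak_labels = [song_matching_rule(r, e) for r, e in candidates]
print(weak_labels)  # [1, -1]: one weak positive match, one abstain
```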
Summary
Cross-domain knowledge bases like YAGO [8], DBpedia [9], Wikidata [20], or the Google Knowledge Graph are being employed for an increasing range of applications, including natural language processing, web search, and question answering. The entity coverage of these knowledge bases is far from complete [4,16]. YAGO and DBpedia, for example, rely on data extracted from Wikipedia and as a result cover mostly head instances that fulfill the Wikipedia notability criteria [12]. As the utility of a knowledge base for many tasks increases with its completeness, adding long-tail entities is an important task. In previous work [12], we proposed a method for extracting long-tail entities and showed that web tables are a promising source for augmenting knowledge bases.