Abstract

Open Information Extraction (OIE) systems extract (“subject text”, “relation text”, “object text”) triples from raw text. Some of these triples are textual versions of facts, i.e., non-canonicalized mentions of entities and relations. In this paper, we investigate whether it is possible to infer new facts directly from the open knowledge graph without any canonicalization and without any supervision from curated knowledge. For this purpose, we propose the open link prediction task, i.e., predicting test facts by completing (“subject text”, “relation text”, ?) questions. Evaluation in such a setup raises the question of whether a correct prediction is actually a new fact that was induced by reasoning over the open knowledge graph, or whether it can be explained trivially. For example, facts can appear in different paraphrased textual variants, which can lead to test leakage. To this end, we propose an evaluation protocol and a methodology for creating the open link prediction benchmark OLPBENCH. We performed experiments with a prototypical knowledge graph embedding model for open link prediction. While the task is very challenging, our results suggest that it is possible to predict genuinely new facts, which cannot be trivially explained.
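To make the task format concrete, below is a minimal sketch, assuming a hypothetical score_object function, of how a (“subject text”, “relation text”, ?) question can be answered and scored by ranking candidate object mentions; the names and data are illustrative only, not taken from the benchmark.

```python
# Hypothetical sketch of ranking-based evaluation for an open link prediction
# question ("subject text", "relation text", ?). score_object stands in for any
# model that assigns a score to a candidate object mention.

def reciprocal_rank(question, true_answers, candidate_mentions, score_object):
    """Return 1 / rank of the best-ranked correct object mention (0 if none)."""
    subject_text, relation_text = question
    ranked = sorted(candidate_mentions,
                    key=lambda obj: score_object(subject_text, relation_text, obj),
                    reverse=True)
    ranks = [i + 1 for i, obj in enumerate(ranked) if obj in true_answers]
    return 1.0 / ranks[0] if ranks else 0.0

# Because mentions are not canonicalized, several surface forms of the same
# entity may all count as correct answers for one question, e.g.:
question = ("NBC-TV", "has office in")
true_answers = {"NYC", "New York"}
```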

Highlights

  • A knowledge graph (KG) (Hayes-Roth, 1983) is a set of (subject, relation, object)-triples, where the subject and object correspond to vertices, and relations to labeled edges

  • We can view open information extraction (OIE) data as an open knowledge graph (OKG) (Galarraga et al., 2014), in which vertices correspond to mentions of entities and edges to open relations

  • To experimentally explore whether it is possible to predict new facts, we focus on knowledge graph embedding (KGE) models (Nickel et al., 2016), which have been applied successfully to link prediction (LP) in KGs (see the sketch after this list)
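As a rough illustration of the last highlight, the sketch below scores open triples with a DistMult-style KGE model in which mention and relation embeddings are composed by averaging token embeddings. This is a simplified stand-in with toy random embeddings, not the exact model or composition function evaluated in the paper.

```python
import numpy as np

# Toy token vocabulary and embeddings; in a real model these are learned.
rng = np.random.default_rng(0)
dim = 8
tokens = ["nbc", "tv", "new", "york", "nyc", "has", "office", "headquarter", "in"]
emb = {t: rng.normal(size=dim) for t in tokens}

def mention_embedding(text):
    """Compose a mention/relation embedding by averaging its token embeddings
    (one simple composition; sequence encoders are another option)."""
    vecs = [emb[t] for t in text.lower().split() if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def distmult_score(subj, rel, obj):
    """DistMult-style triple score: sum_k s_k * r_k * o_k."""
    return float(np.sum(mention_embedding(subj) *
                        mention_embedding(rel) *
                        mention_embedding(obj)))

# Score candidate objects for the question ("NBC TV", "has office in", ?).
for cand in ["New York", "NYC", "TV"]:
    print(cand, distmult_score("NBC TV", "has office in", cand))
```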


Summary

Introduction

A knowledge graph (KG) (Hayes-Roth, 1983) is a set of (subject, relation, object)-triples, where the subject and object correspond to vertices, and relations to labeled edges. We can view OIE data as an open knowledge graph (OKG) (Galarraga et al., 2014), in which vertices correspond to mentions of entities and edges to open relations (see Fig. 1). In open link prediction, given the question (“NBC-TV”, “has office in”, ?), correct answers include “NYC” and “New York” (see Fig. 2b). A simple but problematic way to transfer the standard link prediction approach to OKGs is to sample a set of evaluation triples from the OKG and to use the remaining part of the OKG for training. To see why this approach is problematic, consider the test triple (“NBC-TV”, “has office in”, “New York”) and suppose that the triple (“NBC”, “has headquarter in”, “NYC”) is part of the OKG: the test fact is then merely a paraphrase of a training fact, so predicting it correctly does not require inferring anything new. We show that such paraphrasing and non-relational information can dilute performance evaluation, but that this can be remedied by appropriate dataset construction and experimental settings.
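To make the leakage problem concrete, the sketch below checks whether an evaluation triple is trivially explained by a paraphrased variant of the same fact in the training data. The entity-mention and relation-paraphrase sets here are hypothetical stand-ins; the paper's evaluation protocol and dataset construction address such leakage in a more principled way.

```python
# Hedged sketch of a leakage check: an evaluation triple is "trivially explained"
# if a training triple states the same fact with paraphrased mentions/relations.
# The paraphrase sets below are hypothetical, purely for illustration.

entity_mentions = {"e_nbc": {"NBC", "NBC-TV"}, "e_nyc": {"NYC", "New York"}}
relation_paraphrases = [{"has office in", "has headquarter in"}]

def same_entity(m1, m2):
    return m1 == m2 or any({m1, m2} <= ms for ms in entity_mentions.values())

def same_relation(r1, r2):
    return r1 == r2 or any({r1, r2} <= rs for rs in relation_paraphrases)

def trivially_explained(test_triple, training_triples):
    s, r, o = test_triple
    return any(same_entity(s, s2) and same_relation(r, r2) and same_entity(o, o2)
               for (s2, r2, o2) in training_triples)

train = [("NBC", "has headquarter in", "NYC")]
print(trivially_explained(("NBC-TV", "has office in", "New York"), train))  # True
```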

Open Knowledge Graphs
Evaluation protocol
Creating the Open Link Prediction Benchmark OLPBENCH
Source Dataset
Evaluation Data
Training Data
Open Knowledge Graph Embeddings
Models and Training
Results
Conclusion
A Related Work
B Dataset creation
Multi-Label Binary Classification Batch-Negative Example Loss
Training settings
D Performance Metrics
E Additional Results