Abstract

A knowledge graph (KG) publishes a machine-readable representation of knowledge on the Web. Structured data in a knowledge graph is published using the Resource Description Framework (RDF), where knowledge is represented as a triple (subject, predicate, object). Because knowledge graphs can contain erroneous, outdated or conflicting data, the quality of their facts cannot be guaranteed. The provenance of knowledge can therefore help build trust in these knowledge graphs. In this paper, we analyze two popular general-purpose knowledge graphs, Wikidata and YAGO4, with regard to their representation of provenance and context data. Since RDF does not support metadata for providing provenance and contextualization, most knowledge graphs employ an alternative method, RDF reification. The trustworthiness of facts in a knowledge graph can be enhanced by adding metadata such as the source of the information and the location and time at which the fact occurred. Wikidata employs qualifiers to attach metadata to facts, while YAGO4 collects metadata from Wikidata qualifiers. RDF reification increases the volume of data, as several statements are required to represent a single fact. However, facts in Wikidata and YAGO4 can be fetched without using reification. Another limitation for applications that use provenance data is that not all facts in these knowledge graphs are annotated with provenance data. Structured data in knowledge graphs is noisy, so provenance data can increase its reliability. To the best of our knowledge, this is the first paper that investigates the method and the extent of metadata addition in two prominent KGs, Wikidata and YAGO4.

Highlights

  • Knowledge Graphs (KGs) furnish knowledge about real-world entities in a machine-readable format

  • KGs can be created by extracting structured knowledge from data sources like Wikipedia, collected by Artificial Intelligence (AI) projects, imported from other data sets, or built by crowd-sourcing

  • Trust in the data can be increased by providing additional information about a fact, such as its source, contextual information like the time or location in which the fact was true, or any other relevant information pertaining to the fact.[2]

Summary

Introduction

Knowledge Graphs (KGs) furnish knowledge about real-world entities in a machine-readable format. In RDF reification, a blank node of type Statement, together with the properties subject, predicate and object, is used to represent the triple. The singleton property approach provides formal semantics for RDF reification and reduces the number of triples needed to describe a fact. This form of reification is based on the idea that the relation between two specific entities is unique. Any amount of metadata can be added to the fact by using the singleton property birthplace_1 as the subject. However, this introduces a large number of unique predicates into the KG. The RDF* data model is much more compact than other reification approaches and does not introduce extra predicates as singleton properties do. Frey et al. provide an alternative to overcome this by adding resource nodes for each annotation group.[11]
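
To make the comparison of the three approaches concrete, the following is a minimal sketch that models triples as plain Python tuples and counts how many are needed to annotate one fact. The fact, the metadata values, every `ex:` name (including the linking predicate `ex:singletonPropertyOf`) and the blank-node label `_:stmt1` are assumptions chosen for illustration; they are not taken from Wikidata or YAGO4. The RDF* function lists only the annotation triples and leaves aside whether the quoted fact is also asserted separately.

```python
# Illustrative sketch only: the fact, metadata and ex: names are invented.
FACT = ("ex:Alice", "ex:birthplace", "ex:Paris")
METADATA = {"ex:source": "ex:SomeEncyclopedia", "ex:pointInTime": "1990"}


def standard_reification(fact, metadata, node="_:stmt1"):
    """Describe the fact through a blank node of type rdf:Statement."""
    s, p, o = fact
    triples = [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    ]
    # Metadata is attached to the statement node, not to the fact itself.
    triples += [(node, key, value) for key, value in metadata.items()]
    return triples


def singleton_property(fact, metadata, index=1):
    """Give this one fact its own predicate, then annotate that predicate."""
    s, p, o = fact
    unique_p = f"{p}_{index}"  # e.g. ex:birthplace_1
    triples = [
        (s, unique_p, o),
        # Assumed linking predicate tying the unique predicate to the generic one.
        (unique_p, "ex:singletonPropertyOf", p),
    ]
    triples += [(unique_p, key, value) for key, value in metadata.items()]
    return triples


def rdf_star(fact, metadata):
    """Use the quoted fact itself as the subject of each annotation."""
    return [(fact, key, value) for key, value in metadata.items()]


if __name__ == "__main__":
    for build in (standard_reification, singleton_property, rdf_star):
        print(build.__name__, len(build(FACT, METADATA)))
```

With two metadata entries, the sketch produces six triples for standard reification, four for the singleton property and two for RDF*. Every additional annotated fact would also add one new predicate such as `ex:birthplace_2` under the singleton property approach, which is the predicate growth described above.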
