Abstract

A knowledge graph (KG) publishes a machine-readable representation of knowledge on the Web. Structured data in a knowledge graph is published using the Resource Description Framework (RDF), in which knowledge is represented as a triple (subject, predicate, object). Because knowledge graphs may contain erroneous, outdated, or conflicting data, the quality of their facts cannot be guaranteed. The trustworthiness of facts in a knowledge graph can be enhanced by adding metadata such as the source of the information and the location and time at which the fact occurred. Since RDF does not natively support metadata for provenance and contextualization, most knowledge graphs employ an alternative method, RDF reification. RDF reification increases the volume of data, as several statements are required to represent a single fact. Another limitation for applications that use provenance data, such as those in the medical domain and in cyber security, is that not all facts in these knowledge graphs are annotated with provenance data. In this paper, we provide an overview of prominent reification approaches, together with an analysis of two popular general knowledge graphs, Wikidata and YAGO4, with regard to their representation of provenance and context data. Wikidata employs qualifiers to attach metadata to facts, while YAGO4 collects metadata from Wikidata qualifiers. However, facts in Wikidata and YAGO4 can be fetched without using reification, to cater for applications that do not require metadata. To the best of our knowledge, this is the first paper that investigates the method and the extent of metadata covered by two prominent KGs, Wikidata and YAGO4.
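
As an illustration of why reification inflates the data, the following is a minimal sketch, assuming standard RDF reification (the `rdf:Statement` vocabulary) and plain Python tuples in place of an RDF library. The `reify` helper, the `ex:` identifiers, and the metadata property names are hypothetical and are not taken from the paper or from Wikidata or YAGO4.

```python
# Namespace of the standard RDF vocabulary (rdf:type, rdf:Statement, ...).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(statement_id, subject, predicate, obj, metadata=None):
    """Return the triples describing one fact via standard RDF reification.

    `statement_id` names the statement node; `metadata` maps hypothetical
    property names to values that annotate the statement.
    """
    triples = [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]
    # Each piece of provenance or context adds one more triple about the statement.
    for prop, value in (metadata or {}).items():
        triples.append((statement_id, prop, value))
    return triples


# One plain fact: a single (subject, predicate, object) triple.
fact = ("ex:Marie_Curie", "ex:award", "ex:Nobel_Prize_in_Physics")

# The same fact, reified and annotated with a source and a point in time.
reified = reify(
    "ex:stmt1",
    *fact,
    metadata={"ex:source": "ex:SomeSource", "ex:pointInTime": "1903"},
)

for triple in reified:
    print(triple)
print(len(reified), "triples describe what was originally 1 triple")
```

In this sketch, four triples are needed just to describe the statement, and each metadata item adds one more. This is the overhead the abstract refers to.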

Highlights

  • Knowledge Graphs (KGs) provide knowledge about real-world entities in a machine-readable format

  • KGs can be created by extracting structured knowledge from data sources such as Wikipedia, collected by Artificial Intelligence (AI) projects, imported from other data sets, or built by crowd-sourcing

  • Wikidata offers a higher level of contextualization and provenance data, whereas YAGO4 offers temporal information

Summary

Introduction

Knowledge Graphs (KGs) provide knowledge about real-world entities in a machine-readable format. These large-scale KGs supply both domain-dependent and domain-independent knowledge for many applications, such as entity linking, information retrieval, and several other data mining tasks. Trust in the data can be increased by attaching additional information to a fact: its source, contextual information such as the time or location at which the fact was true, or any other relevant detail.[3] This extra information supports the authenticity of the data and, in turn, can help machines extract correct facts for critical applications such as those in the medical domain and in cyber security.

