RDFFrames: knowledge graph access for machine learning tools

Aisha Mohamed,Ashraf Aboulnaga,Abdurrahman Ghanem,Zoi Kaoudi,Ghadeer Abuoda

doi:10.1007/s00778-021-00690-5


Open Access

https://doi.org/10.1007/s00778-021-00690-5


Abstract

Knowledge graphs represented as RDF datasets are integral to many machine learning applications. RDF is supported by a rich ecosystem of data management systems and tools, most notably RDF database systems that provide a SPARQL query interface. Surprisingly, machine learning tools for knowledge graphs do not use SPARQL, despite the obvious advantages of using a database system. This is due to the mismatch between SPARQL and machine learning tools in terms of data model and programming style. Machine learning tools work on data in tabular format and process it using an imperative programming style, while SPARQL is declarative and has as its basic operation matching graph patterns to RDF triples. We posit that a good interface to knowledge graphs from a machine learning software stack should use an imperative, navigational programming paradigm based on graph traversal rather than the SPARQL query paradigm based on graph patterns. In this paper, we present RDFFrames, a framework that provides such an interface. RDFFrames provides an imperative Python API that gets internally translated to SPARQL, and it is integrated with the PyData machine learning software stack. RDFFrames enables the user to make a sequence of Python calls to define the data to be extracted from a knowledge graph stored in an RDF database system, and it translates these calls into a compact SPARQL query, executes it on the database system, and returns the results in a standard tabular format. Thus, RDFFrames is a useful tool for data preparation that combines the usability of PyData with the flexibility and performance of RDF database systems.
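
The idea of translating a sequence of imperative, navigational calls into a single SPARQL query can be illustrated with a toy sketch. The code below is not the RDFFrames API: the class `ToyFrame`, its methods `expand`, `filter`, and `to_sparql`, the `ex:` prefix, and the predicate names are all hypothetical and chosen only for illustration. It assumes each call simply records one graph pattern or filter, and that the query text is generated once at the end.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical namespace used only for this illustration.
PREFIXES = {"ex": "http://example.org/"}


@dataclass(frozen=True)
class ToyFrame:
    """Toy stand-in for an imperative frame over a knowledge graph.

    This is NOT the RDFFrames API; it only records navigation steps
    and renders them as one SPARQL query string.
    """
    patterns: Tuple[Tuple[str, str, str], ...] = ()
    filters: Tuple[str, ...] = ()
    columns: Tuple[str, ...] = ()

    def expand(self, src: str, predicate: str, dst: str) -> "ToyFrame":
        """Follow `predicate` from column `src` to a new column `dst`."""
        new_cols = tuple(c for c in (src, dst) if c not in self.columns)
        return ToyFrame(
            self.patterns + ((src, predicate, dst),),
            self.filters,
            self.columns + new_cols,
        )

    def filter(self, condition: str) -> "ToyFrame":
        """Keep only rows satisfying a SPARQL filter expression."""
        return ToyFrame(self.patterns, self.filters + (condition,), self.columns)

    def to_sparql(self) -> str:
        """Render all recorded steps as a single SELECT query."""
        lines = [f"PREFIX {name}: <{iri}>" for name, iri in PREFIXES.items()]
        lines.append("SELECT " + " ".join(f"?{c}" for c in self.columns))
        lines.append("WHERE {")
        lines += [f"  ?{s} {p} ?{o} ." for s, p, o in self.patterns]
        lines += [f"  FILTER ({c})" for c in self.filters]
        lines.append("}")
        return "\n".join(lines)


# Navigate from movies to actors to birth places, then filter.
frame = (
    ToyFrame()
    .expand("movie", "ex:starring", "actor")
    .expand("actor", "ex:birthPlace", "country")
    .filter('regex(str(?country), "USA")')
)
query = frame.to_sparql()
print(query)
```

Each call returns a new frame instead of running a query, so the whole chain can be collapsed into one query before anything is sent to the database. A real system would also have to handle grouping, joins, and optional patterns, which this sketch leaves out.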

Highlights

There has recently been a sharp growth in the number of knowledge graph datasets that are made available in the RDF (Resource Description Framework) data model.
RDFFrames needs a few milliseconds to generate the SPARQL query; the remaining time is spent on issuing the query to the engine, processing it, and retrieving the results. This is typical in all our experiments.
The query produced by naive query generation did not finish in one hour, and we terminated it after this time. This demonstrates the need for RDFFrames to generate optimized SPARQL rather than rely exclusively on the query optimizer.

Summary

Introduction

There has recently been a sharp growth in the number of knowledge graph datasets that are made available in the RDF (Resource Description Framework) data model. The RDF ecosystem includes standard serialization formats, parsing and processing libraries, and most notably RDF database management systems (a.k.a. RDF engines or triple stores) that support SPARQL, the W3C standard query language for RDF data. Examples of these systems include OpenLink Virtuoso, Apache Jena, and managed services such as Amazon Neptune.
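
To make the gap between a SPARQL engine's output and the tabular data that machine learning tools expect more concrete, the following sketch flattens a response in the W3C SPARQL 1.1 Query Results JSON Format into a list of rows. The sample response is hand-written with made-up `example.org` values, no engine is contacted, and the helper name `bindings_to_rows` is an assumption for illustration rather than part of any library.

```python
import json
from typing import Dict, List, Optional

# Hand-written sample in the SPARQL 1.1 Query Results JSON Format.
# The values are invented for illustration only.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["movie", "actor"]},
  "results": {"bindings": [
    {"movie": {"type": "uri", "value": "http://example.org/movie/1"},
     "actor": {"type": "uri", "value": "http://example.org/actor/7"}},
    {"movie": {"type": "uri", "value": "http://example.org/movie/2"}}
  ]}
}
"""


def bindings_to_rows(response_text: str) -> List[Dict[str, Optional[str]]]:
    """Turn SPARQL JSON results into one dict per row, keyed by variable name.

    Variables that are unbound in a solution (for example, from an
    OPTIONAL pattern) are absent from the binding and become None here.
    """
    doc = json.loads(response_text)
    columns = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        row = {c: binding[c]["value"] if c in binding else None for c in columns}
        rows.append(row)
    return rows


rows = bindings_to_rows(SAMPLE_RESPONSE)
for row in rows:
    print(row)
```

A list of dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` to obtain a dataframe with one column per query variable.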



Journal: The VLDB Journal	Publication Date: Aug 26, 2021
Citations: 4	License type: open-access


