QPPDs: Querying Property Paths Over Distributed RDF Datasets

Qaiser Mehmood, Mathieu D'Aquin, Ratnesh Sahay, Muhammad Saleem, Axel-Cyrille Ngonga Ngomo

doi:10.1109/access.2019.2930416

Open Access

https://doi.org/10.1109/access.2019.2930416

Abstract

A key property of linked data, i.e., the web-based representation and publication of data as interconnected labeled graphs, is that it enables querying and navigating through datasets distributed across the network. SPARQL 1.1, the current standard query language for RDF-based linked data, defines a construct called property paths (PP) to navigate between the entities of a graph. This is potentially very useful in a number of use cases, e.g., in the biomedical domain, where large datasets are available as linked data graphs. However, the use of PP in SPARQL 1.1 is possible only on a single local graph, which requires merging all distributed datasets into one large, centrally stored graph and therefore reduces the value of using linked data in the first place. We propose an index-based approach, called QPPDs, for answering path queries across multiple distributed datasets. We provide a heuristic-based source selection mechanism to select the relevant datasets (also called data sources) for a given path query, and a technique that federates queries to the selected sources and assembles (merges) the partial or complete paths retrieved from those remote datasets. We demonstrate our approach on a genomics use case, where the description of biological entities (e.g., genes, diseases, and drugs) is scattered across multiple datasets. In our preliminary investigation, we evaluate the QPPDs approach with real-world path queries on highly heterogeneous biological data, in terms of performance (overall path retrieval time) and result completeness, i.e., the number of paths retrieved.
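To make the idea of index-based source selection and path assembly more concrete, the following is a minimal Python sketch. It is not the QPPDs implementation: the dataset names, the `ex:` identifiers, the subject-to-dataset index, and the breadth-first assembly strategy are all assumptions chosen for illustration, and the "remote" datasets are plain in-memory lists rather than SPARQL endpoints.

```python
from collections import deque

# Hypothetical toy data: three "remote" datasets, each a list of
# (subject, predicate, object) triples. All identifiers are made up.
DATASETS = {
    "genes": [("ex:geneA", "ex:encodes", "ex:proteinB")],
    "proteins": [("ex:proteinB", "ex:targetedBy", "ex:drugC")],
    "drugs": [
        ("ex:drugC", "ex:treats", "ex:diseaseD"),
        ("ex:drugE", "ex:treats", "ex:diseaseF"),
    ],
}


def build_index(datasets):
    """Map each subject node to the names of the datasets that describe it."""
    index = {}
    for name, triples in datasets.items():
        for s, _p, _o in triples:
            index.setdefault(s, set()).add(name)
    return index


def select_sources(index, node):
    """Index lookup standing in for source selection (an assumed heuristic)."""
    return sorted(index.get(node, ()))


def outgoing_edges(datasets, source, node):
    """Stand-in for a sub-query sent to one source: edges leaving `node`."""
    return [(p, o) for s, p, o in datasets[source] if s == node]


def find_paths(datasets, index, start, goal, max_hops=4):
    """Breadth-first search that extends partial paths source by source."""
    complete = []
    # Each queue entry is (current node, hops so far); a hop is
    # (predicate, object, source dataset).
    queue = deque([(start, [])])
    while queue:
        node, hops = queue.popleft()
        if node == goal and hops:
            complete.append(hops)
            continue
        if len(hops) >= max_hops:
            continue
        # Nodes already on this partial path, used to avoid cycles.
        visited = {start} | {o for _p, o, _src in hops}
        for source in select_sources(index, node):
            for p, o in outgoing_edges(datasets, source, node):
                if o not in visited:
                    queue.append((o, hops + [(p, o, source)]))
    return complete


INDEX = build_index(DATASETS)
PATHS = find_paths(DATASETS, INDEX, "ex:geneA", "ex:diseaseD")
for path in PATHS:
    print(" -> ".join(f"{p} {o} [{src}]" for p, o, src in path))
```

In this sketch, only the datasets that the index lists for the current node are consulted at each hop, and each partial path records which source contributed each edge, so a complete path can span several datasets without first merging them into one graph.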

Highlights

The potential benefits of using Linked Data have been increasingly considered in a variety of domains where rich, multi-source data need to be explored, e.g., bioinformatics, geography, and literature.
MOTIVATING SCENARIO: We present two motivating scenarios: (1) a real-world scenario showing the use of distributed property paths in RDF datasets for Cancer Genomics; and (2) a toy scenario that is used as a running example to explain the proposed approach.
The motivation behind this work is the need of the BIOOPENER project, which aims to link and discover linked data across cancer and biomedical data held in publicly available distributed triple stores.

Summary

INTRODUCTION

The potential benefits of using Linked Data (also known as the Web of Data or Semantic Web data) have been increasingly considered in a variety of domains where rich, multi-source data need to be explored, e.g., bioinformatics, geography, and literature. In the biomedical domain, for example, a lot of data is publicly available from multiple, heterogeneous sources. In such a case, it is very common for two biological entities (e.g., gene, protein, drug, or pathway) to be related through paths formed of links going across several of those datasets. To find paths between two entities, the centralized approaches adopted by current systems pose several challenges: (i) querying multiple datasets requires the user to first merge them into a single graph, which is a cumbersome task; (ii) copied data need to be synchronized; (iii) merged data might not be as up-to-date as the original source; (iv) data is not always under the control of, or fully accessible to, the person querying it; and (v) scalability is a major issue in centralized approaches.

MOTIVATING SCENARIO

PRELIMINARIES

RELATED WORK

THE QPPDS APPROACH

EVALUATION

CONCLUSION

Journal: IEEE Access	Publication Date: Jan 1, 2019
Citations: 6	License type: CC BY 4.0

Similar Papers

FedS: Towards Traversing Federated RDF Graphs
Qaiser Mehmood ... Ratnesh Sahay
01 Jan 2018

Efficient distributed path computation on RDF knowledge graphs using partial evaluation
Qaiser Mehmood ... Muhammad Saleem
World Wide Web | VOL. 25
04 Nov 2021

A Context-Based Semantics for SPARQL Property Paths Over the Web
Olaf Hartig ... Giuseppe Pirrò
01 Jan 2015

SPARQL with property paths on the Web
Olaf Hartig ... Giuseppe Pirrò
Semantic Web | VOL. 8
01 Jan 2017
