VEDAS: an efficient GPU alternative for store and query of large RDF data sets

Pisit Makpaisit, Chantana Chantrapornchai

doi:10.1186/s40537-021-00513-y

Open Access

Journal: Journal of Big Data	Publication Date: Sep 16, 2021
Citations: 3	License type: open-access

Affiliation: Kasetsart University

Abstract

Resource Description Framework (RDF) is commonly used as a standard for data interchange on the web. A collection of RDF data sets can form a large graph that is time-consuming to query. Modern Graphics Processing Units (GPUs) can execute parallel programs to reduce running time. In this paper, we propose a novel RDF data representation, along with a query processing algorithm, that is suitable for GPU processing. The main challenges of the GPU architecture are the limited memory size, the memory transfer latency, and the vast number of GPU cores, so our system is designed to make better use of the GPU cores and to reduce the effect of memory transfer. We propose a representation consisting of indices and column-based RDF ID data that reduces the GPU memory requirement. Indexing and pre-upload filtering techniques are then applied to reduce the data transfer between host and GPU memory. We add an index swapping process to facilitate sorting and joining the data on a given variable, and a pre-upload step to reduce the size of the result storage and the data transfer time. The experimental results show that our representation is about 35% smaller than the traditional NT format and 40% smaller than that of gStore. On the WatDiv test suite, query processing achieves speedups ranging from 1.95 to 397.03 compared with RDF-3X and gStore. On the LUBM benchmark, it achieves speedups of 578.57 and 62.97 compared to RDF-3X and gStore, respectively. The analysis shows which query cases benefit from our approach.
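
To make the idea of column-based RDF ID data more concrete, the following Python sketch dictionary-encodes the terms of a few triples into integer IDs and stores subjects, predicates, and objects as three separate columns sorted by subject. It is only an illustration under stated assumptions: the function name `encode_triples`, the single shared dictionary, the sample `ex:` terms, and the choice of sort order are hypothetical and are not taken from the VEDAS implementation.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def encode_triples(triples: List[Triple]):
    """Dictionary-encode RDF terms and return (dictionary, columns).

    Hypothetical helper: one shared dictionary maps every term to a
    positive integer ID, and the encoded triples are stored as three
    parallel columns sorted by (subject, predicate, object).
    """
    term_to_id: Dict[str, int] = {}

    def get_id(term: str) -> int:
        # Assign the next free ID the first time a term is seen.
        if term not in term_to_id:
            term_to_id[term] = len(term_to_id) + 1
        return term_to_id[term]

    rows = [(get_id(s), get_id(p), get_id(o)) for s, p, o in triples]
    rows.sort()  # sorted rows allow range lookups on the subject column

    subjects = [r[0] for r in rows]
    predicates = [r[1] for r in rows]
    objects = [r[2] for r in rows]
    return term_to_id, (subjects, predicates, objects)


# Illustrative data only; these triples do not come from the paper.
sample = [
    ("ex:bob", "ex:knows", "ex:alice"),
    ("ex:alice", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:type", "ex:Person"),
]
dictionary, (subj_col, pred_col, obj_col) = encode_triples(sample)
```

Storing fixed-width integer IDs in columns, instead of the full term strings of an NT file, is one plausible way such a representation can shrink the data that has to reside in GPU memory.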

Highlights

The Resource Description Framework (RDF) was proposed by W3C as a data exchange standard for the semantic web
We are interested in utilizing Graphics Processing Units (GPUs) to improve query performance for RDF data
Given the constraints of the GPU, we develop a compact RDF representation and introduce a query processing framework that is suitable for GPU processing

Summary

Introduction

The Resource Description Framework (RDF) was proposed by W3C as a data exchange standard for the semantic web. With the pre-upload filter for a given ID, we bound the ID values to reduce the number of tuples, which reduces both the data transfer to GPU memory and the index swapping time. The on-demand option uploads the triple data to GPU memory based on the filter and selection method described in Section "VEDAS framework and operations", as indicated by the subquery; that is, it uploads the triple-IDs only when needed. VEDAS obtains larger speedups especially on class C, because the intermediate result after the join operation is large and the GPU performs better in this case (see Table 7). This query has 17.5M rows to upload and only 183 rows
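
The effect of bounding ID values before upload can be sketched in a few lines of Python. The example assumes that the key column is already sorted and uses binary search to find the slice of rows whose IDs fall inside the bounds, so that only that slice would need to be copied to GPU memory. The function names `pre_upload_filter` and `select_rows`, the sample columns, and the use of binary search are assumptions for illustration; the source does not specify how VEDAS implements this step, and no actual GPU transfer is performed here.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence, Tuple


def pre_upload_filter(sorted_ids: Sequence[int], lower: int, upper: int) -> Tuple[int, int]:
    """Return the half-open row range [start, end) with lower <= id <= upper.

    Assumes sorted_ids is in ascending order.
    """
    start = bisect_left(sorted_ids, lower)
    end = bisect_right(sorted_ids, upper)
    return start, end


def select_rows(columns: Sequence[List[int]], key_index: int,
                lower: int, upper: int) -> Tuple[List[int], ...]:
    """Slice every column to the rows selected on the sorted key column."""
    start, end = pre_upload_filter(columns[key_index], lower, upper)
    return tuple(col[start:end] for col in columns)


# Illustrative ID columns, sorted by the first (subject) column.
subjects = [1, 3, 3, 7, 9]
predicates = [2, 2, 5, 2, 5]
objects = [3, 4, 6, 8, 4]

# Only rows whose subject ID lies in [3, 7] would be uploaded.
to_upload = select_rows((subjects, predicates, objects), 0, 3, 7)
rows_saved = len(subjects) - len(to_upload[0])
```

In this toy case three of the five rows remain after filtering; the same idea applied to a large column is what lets a bounded query skip most of the host-to-GPU transfer.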

[Figure residue: result size by query (C2, C3, F1-F5, S1-S7, L1-L5)]

Conclusion and future work

Similar Papers

Combined bit map representation and its applications to query processing of resource description framework on GPU
Chantana Chantrapornchai ... Chidchanok Choksuchat
International Journal of High Performance Computing and Networking | VOL. 15
01 Jan 2019

TripleID: A Low-Overhead Representation and Querying Using GPU for Large RDFs
Chantana Chantrapornchai ... Chidchanok Choksuchat
01 Jan 2015

Practical parallel string matching framework for RDF entailments with GPUs
Chidchanok Choksuchat ... Chantana Chantrapornchai
Information Systems Frontiers | VOL. 20
26 Sep 2016

Fast Processing SPARQL Queries on Large RDF Data
Guang Yang ... Pingpeng Yuan
01 Aug 2016
