A SPARQL query processing system using map-phase-multi join for big data in clouds

Sheng Wei Huang, Ming Fong Tsai, Chia Ho Yu, Ce Kuen Shieh

doi:10.1504/ijipt.2017.087555

Abstract

Big data refers to datasets that are too large and complex to be stored and analysed by traditional data processing tools. Linked data is one approach to handling big data, and it is stored and processed in a TripleStore. For huge datasets, a TripleStore requires more scalable techniques. The MapReduce programming model is among the most representative cloud technologies. Several approaches use MapReduce to serve SPARQL queries, but they still exhibit unacceptable performance on complex queries. In this paper, we propose a map-phase-multi-join algorithm for processing SPARQL queries. Using multi-join, job initialisation time is reduced by avoiding iterative MapReduce jobs. Furthermore, the map-phase join saves bandwidth by preventing join-less data from being transferred among computing nodes. We also design a storage schema and a join-order rule that enhance the performance of our system. The evaluation results show that our system outperforms traditional join approaches in most queries.
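The general idea of joining several triple patterns inside a single map pass can be sketched in a few lines of Python. This is an illustrative in-memory toy, not the authors' algorithm, storage schema, or join-order rule: the function names (`build_index`, `map_side_multi_join`), the per-predicate lists of (subject, object) pairs, and the sample data are all assumptions made for the example. It assumes a star-shaped query in which every pattern shares the subject variable, and that the smaller patterns fit in memory as hash indexes on each mapper.

```python
from collections import defaultdict


def build_index(pairs):
    """Hash a list of (subject, object) pairs by subject (hypothetical helper)."""
    index = defaultdict(list)
    for s, o in pairs:
        index[s].append(o)
    return index


def map_side_multi_join(split, side_indexes):
    """Join one input split against every side index in a single pass.

    split: iterable of (subject, object) pairs for the streamed pattern.
    side_indexes: list of dicts mapping subject -> [objects], one per
        additional triple pattern sharing the subject variable.
    Yields tuples (subject, streamed_object, object_1, ..., object_k).
    """
    for s, o in split:
        rows = [(s, o)]
        for index in side_indexes:
            matches = index.get(s)
            if not matches:
                # The subject cannot join, so it is dropped here and would
                # never be shuffled to another node.
                rows = []
                break
            rows = [row + (m,) for row in rows for m in matches]
        yield from rows


# Toy data for the assumed query:
#   ?x memberOf ?d . ?x advisor ?a . ?x name ?n
member_of = [("alice", "cs"), ("bob", "ee"), ("dave", "cs")]
advisor = [("alice", "carol"), ("bob", "carol"), ("bob", "erin")]
name = [("alice", "Alice"), ("bob", "Bob"), ("carol", "Carol")]

side = [build_index(advisor), build_index(name)]
results = list(map_side_multi_join(member_of, side))
for row in results:
    print(row)
```

All three patterns are joined in one pass over `member_of`, so no chain of two-way join jobs is needed, and `dave`, who has no `advisor` triple, is discarded before any data would leave the mapper. A real MapReduce deployment would additionally have to distribute the side indexes to every node and handle patterns too large to hold in memory, which this sketch does not attempt.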

Journal: International Journal of Internet Protocol Technology, Vol. 10
Publication Date: Jan 1, 2017
Citations: 2