A parallel query processing system based on graph-based database partitioning

Yoon-Min Nam, Donghyoung Han, Min-Soo Kim

doi:10.1016/j.ins.2018.12.031

Abstract

Because parallel database systems must process large amounts of data, a scalable and efficient horizontal database partitioning method is essential. Existing partitioning methods have two major drawbacks: they cause large amounts of data redundancy, and despite that redundancy they still require expensive shuffle operations for join queries in many cases. We examine the drawbacks that originate from tree-based partitioning schemes and propose a novel graph-based database partitioning method, GPT, that both improves query performance and reduces data redundancy. We integrate GPT into a parallel query processing system, Spark SQL, across all relevant layers and modules, including the query plan generator and the scan operator. Through extensive experiments on three benchmarks (TPC-DS, IMDB, and BioWarehouse), we show that GPT significantly outperforms the state-of-the-art method in both storage overhead and query performance.

Journal: Information Sciences
Publication Date: Dec 21, 2018
Citations: 9

Similar Papers

A Graph-Based Database Partitioning Method for Parallel OLAP Query Processing
Yoon-Min Nam ... Min-Soo Kim
01 Apr 2018

Heuristic optimization of speedup and benefit/cost for parallel database scans on shared-memory multiprocessors
M Rys ... G Weikum
01 Apr 1994

Database processing models in parallel processing systems
Sakti Pramanik ... Myoung Ho Kim
01 Jan 1989

Handbook on Parallel and Distributed Processing
01 Jan 1999
