Faster cloud Star Joins with Reduced Disk Spill and Network Communication

Jaqueline Joice Brito, Thiago Mosqueiro, Ricardo Rodrigues Ciferri, Cristina Dutra De Aguiar Ciferri

doi:10.1016/j.procs.2016.05.299

Abstract

By combining powerful parallel frameworks with on-demand commodity hardware, cloud computing has made analytics and decision support systems standard for enterprises of all sizes. Given the unprecedented volumes of data these companies accumulate, filtering and retrieving that data are pressing challenges. The data is often organized in star schemas, in which Star Joins are ubiquitous and expensive operations. In particular, excessive disk spill and network communication are tight bottlenecks for all current MapReduce or Spark solutions. Here, we propose two efficient solutions that reduce computation time by at least 60%: the Spark Bloom-Filtered Cascade Join (SBFCJ) and the Spark Broadcast Join (SBJ). By contrast, a direct Spark implementation of a sequence of joins performs poorly, which shows the importance of additional filtering to minimize disk spill and network communication. Finally, while SBJ is twice as fast when memory per executor is large enough, SBFCJ is remarkably resilient in low-memory scenarios. Both algorithms are very competitive solutions for Star Joins in the cloud.
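The abstract names the two strategies but does not spell out their steps. The following is a minimal single-process sketch in plain Python (no Spark) of the general ideas the names suggest, not the authors' implementation. It assumes that a broadcast-style join ships the filtered dimension tables as in-memory hash maps to wherever the fact table is scanned, and that a Bloom-filtered cascade first prunes fact rows with compact Bloom filters before running one join per dimension. The `BloomFilter` class, both function names, and the sample tables are hypothetical and chosen only for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def filter_dimensions(dims, predicates):
    """Keep only the dimension rows that satisfy the query predicates."""
    return {
        fk: {pk: attrs for pk, attrs in table.items() if predicates[fk](attrs)}
        for fk, table in dims.items()
    }


def broadcast_star_join(fact_rows, dims, predicates):
    """Broadcast-style join: probe every filtered dimension hash map per fact row."""
    tables = filter_dimensions(dims, predicates)  # would be broadcast in a cluster
    result = []
    for row in fact_rows:
        if all(row[fk] in table for fk, table in tables.items()):
            joined = dict(row)
            for fk, table in tables.items():
                joined[fk + "_attrs"] = table[row[fk]]
            result.append(joined)
    return result


def bloom_cascade_star_join(fact_rows, dims, predicates):
    """Bloom-filtered cascade: prune fact rows first, then join dimension by dimension."""
    tables = filter_dimensions(dims, predicates)
    filters = {}
    for fk, table in tables.items():
        bloom = BloomFilter()
        for pk in table:
            bloom.add(pk)
        filters[fk] = bloom
    # Cheap pre-filter; false positives are removed by the exact joins below.
    survivors = [r for r in fact_rows if all(r[fk] in filters[fk] for fk in filters)]
    for fk, table in tables.items():
        survivors = [
            {**r, fk + "_attrs": table[r[fk]]} for r in survivors if r[fk] in table
        ]
    return survivors


# Hypothetical star schema: one fact table and two dimension tables.
fact = [
    {"date_id": 1, "store_id": 10, "amount": 5.0},
    {"date_id": 2, "store_id": 10, "amount": 7.5},
    {"date_id": 1, "store_id": 20, "amount": 3.0},
    {"date_id": 3, "store_id": 10, "amount": 9.0},
]
dims = {
    "date_id": {1: {"year": 2015}, 2: {"year": 2016}, 3: {"year": 2015}},
    "store_id": {10: {"region": "south"}, 20: {"region": "north"}},
}
predicates = {
    "date_id": lambda attrs: attrs["year"] == 2015,
    "store_id": lambda attrs: attrs["region"] == "south",
}

sbj_like = broadcast_star_join(fact, dims, predicates)
sbfcj_like = bloom_cascade_star_join(fact, dims, predicates)
```

Both functions return the same rows. In this sketch the trade-off is that the broadcast variant must hold every filtered dimension table in memory at once, while the Bloom-filter variant only needs the much smaller filters during the pruning pass, which is in line with the abstract's remark that SBFCJ is resilient when memory per executor is low.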


Journal: Procedia Computer Science
Publication Date: Jan 1, 2016
Citations: 8
License type: CC BY-NC-ND


