Clydesdale

Andrey Balmin, Sandeep Tata, Tim Kaldewey

doi:10.1145/2213836.2213938

Abstract

Several recent proposals modify Hadoop, radically changing its storage organization or query processing techniques, to obtain good performance for structured data processing. We showcase Clydesdale, a research prototype for structured data processing on Hadoop that can achieve dramatic performance improvements over existing solutions without any changes to the underlying MapReduce implementation. Clydesdale achieves this through a novel synthesis of several techniques from the database literature, carefully adapted to the Hadoop environment. On the star schema benchmark, we show that Clydesdale is on average 38x faster than Hive, the dominant approach for structured data processing on Hadoop today. To the best of our knowledge, Clydesdale is the fastest solution on Hadoop for processing workloads on structured data sets that fit a star schema. Attendees will be able to run queries on the star schema benchmark data on a remote Hadoop cluster with Clydesdale and Hive installed, and get a breakdown of the time taken to execute each query. Attendees will also be able to pose their own queries using ClyQL, a novel embedded DSL in Scala that can be used to rapidly prototype star join queries. With this demonstration, we hope to convince attendees that, contrary to previous belief, Hadoop can indeed efficiently support structured data processing.
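For readers unfamiliar with the term, a star join query joins one large fact table with several small dimension tables and aggregates the result. The sketch below illustrates that query shape in plain Python. It is not ClyQL and it is not Clydesdale's implementation: the abstract does not describe either in detail. The toy tables, their column names, and the function `star_join` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: one fact table whose rows reference
# two small dimension tables by key (a minimal star schema).
date_dim = [
    {"d_datekey": 1, "d_year": 1993},
    {"d_datekey": 2, "d_year": 1994},
]
part_dim = [
    {"p_partkey": 10, "p_category": "MFGR#12"},
    {"p_partkey": 11, "p_category": "MFGR#13"},
]
lineorder = [
    {"lo_orderdate": 1, "lo_partkey": 10, "lo_revenue": 100},
    {"lo_orderdate": 1, "lo_partkey": 11, "lo_revenue": 250},
    {"lo_orderdate": 2, "lo_partkey": 10, "lo_revenue": 400},
    {"lo_orderdate": 2, "lo_partkey": 10, "lo_revenue": 50},
]


def star_join(fact, dims, group_by, measure):
    """Join a fact table with filtered dimensions and sum a measure.

    dims is a list of (rows, dim_key, fact_key, predicate) tuples.
    This is an illustrative in-memory hash join, an assumption of
    this sketch rather than the technique used by Clydesdale.
    """
    # Build one hash table per dimension, keeping only qualifying rows.
    tables = []
    for rows, dim_key, fact_key, predicate in dims:
        tables.append((fact_key, {r[dim_key]: r for r in rows if predicate(r)}))

    totals = defaultdict(int)
    for row in fact:
        joined = dict(row)
        for fact_key, table in tables:
            match = table.get(row[fact_key])
            if match is None:
                break  # fact row is filtered out by this dimension
            joined.update(match)
        else:
            # Reached only when every dimension produced a match.
            totals[tuple(joined[c] for c in group_by)] += joined[measure]
    return dict(totals)


# Total revenue per year for parts in one category.
result = star_join(
    lineorder,
    [
        (date_dim, "d_datekey", "lo_orderdate", lambda r: True),
        (part_dim, "p_partkey", "lo_partkey",
         lambda r: r["p_category"] == "MFGR#12"),
    ],
    group_by=["d_year"],
    measure="lo_revenue",
)
print(result)
```

Each fact row is scanned once and probed against every dimension table, which is the general pattern that makes star joins a natural fit for a single pass over the large table.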

Similar Papers

Variations of the star schema benchmark to test the effects of data skew on query performance
Tilmann Rabl ... Elizabeth O'Neil
21 Apr 2013

Columnar NoSQL Star Schema Benchmark
Khaled Dehdouh ... Omar Boussaid
01 Jan 2014

Clydesdale
Tim Kaldewey ... Eugene J Shekita
27 Mar 2012

Performance Comparison of Column-Oriented and Row-Oriented Database Systems for Star Schema Join Processing
Byung-Jung Oh ... Soo-Min Ahn
Journal of the Korea Society of Computer and Information | VOL. 16
31 Aug 2011
