iShuffle: Improving Hadoop Performance with Shuffle-on-Write

Yanfei Guo, Xiaobo Zhou, Dazhao Cheng, Jia Rao

doi:10.1109/tpds.2016.2587645


Open Access

https://doi.org/10.1109/tpds.2016.2587645


Abstract

Hadoop is a popular implementation of the MapReduce framework for running data-intensive jobs on clusters of commodity servers. Shuffle, the all-to-all input data fetching phase between the map and reduce phases, can significantly affect job performance. However, the shuffle and reduce phases are coupled in Hadoop, and the shuffle can only be performed by running the reduce tasks. This leaves the potential parallelism between multiple waves of map and reduce tasks unexploited and wastes resources in multi-tenant Hadoop clusters, which significantly delays job completion. More importantly, Hadoop lacks the ability to schedule tasks efficiently and to mitigate data distribution skew among reduce tasks, which further degrades job performance. In this work, we propose to decouple shuffle from reduce tasks and convert it into a platform service provided by Hadoop. We present iShuffle, a user-transparent shuffle service that proactively pushes map output data to nodes via a novel shuffle-on-write operation and flexibly schedules reduce tasks considering workload balance. Experimental results with representative workloads and a Facebook workload trace show that iShuffle reduces job completion time by as much as 29.6 and 34 percent in single-user and multi-user clusters, respectively.
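
To make the workload-balance idea concrete, the following is a minimal sketch of how map output partitions might be assigned to nodes so that no reduce node receives a disproportionate share of the data. It assumes that a size estimate for each partition is available before placement. The function names (`place_partitions`, `node_loads`), the example sizes, and the greedy largest-first heuristic are all illustrative assumptions; the abstract does not specify iShuffle's actual placement algorithm.

```python
import heapq


def place_partitions(partition_sizes, nodes):
    """Assign each partition to a node using a greedy largest-first heuristic.

    partition_sizes: dict mapping partition id -> estimated size (hypothetical input).
    nodes: list of node names.
    Returns a dict mapping partition id -> node name.
    """
    # Min-heap of (current load, node) so the least-loaded node is popped first.
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)

    placement = {}
    # Place the largest partitions first; small ones then fill the gaps.
    by_size = sorted(partition_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for pid, size in by_size:
        load, node = heapq.heappop(heap)
        placement[pid] = node
        heapq.heappush(heap, (load + size, node))
    return placement


def node_loads(partition_sizes, placement):
    """Sum the estimated data volume assigned to each node."""
    loads = {}
    for pid, node in placement.items():
        loads[node] = loads.get(node, 0) + partition_sizes[pid]
    return loads


# Skewed example: partition 0 holds half of all the data.
sizes = {0: 70, 1: 20, 2: 20, 3: 15, 4: 15}
placement = place_partitions(sizes, ["n1", "n2"])
loads = node_loads(sizes, placement)
print(placement)
print(loads)
```

In this example the oversized partition gets a node to itself and the four smaller partitions share the other node, so both nodes end up with the same estimated load. A real shuffle service would also have to account for data locality, network cost, and the accuracy of the size estimates, none of which this sketch models.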


Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Jun 1, 2017
Citations: 86
License type: publisher-specific, author manuscript

