An Efficiency-Aware Scheduling for Data-Intensive Computations on MapReduce Clusters

Hui Zhao, Shuqiang Yang, Hua Fan, Jinghu Xu, Zhikun Chen

doi:10.1587/transinf.e96.d.2654

Abstract

Scheduling plays a key role in MapReduce systems. In this paper, we explore the efficiency of a MapReduce cluster running many independent and continuously arriving MapReduce jobs. Data locality and load balancing are two important factors in improving computation efficiency in MapReduce systems for data-intensive computations. Traditional cluster scheduling technologies are not well suited to the MapReduce environment. Several schedulers are in use for the popular open-source Hadoop MapReduce implementation; however, they cannot optimize both factors well. Our main objective is to minimize the total flowtime of all jobs. Because this is a strongly NP-hard problem, we adopt effective heuristics to seek a satisfactory solution. We formalize the scheduling problem as a job selection problem and propose a load-balance-aware job selection algorithm. At the task level, we design a strict data-locality scheduling algorithm for map tasks on map machines and a load-balance-aware scheduling algorithm for reduce tasks on reduce machines. Comprehensive experiments have been conducted to compare our scheduling strategy with well-known Hadoop scheduling strategies. The experimental results validate the efficiency of our proposed scheduling strategy.
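
The abstract names three ingredients without giving their details: the total-flowtime objective, strict data locality for map tasks, and load-balance-aware placement of reduce tasks. The following is only a minimal illustrative sketch of those ideas, not the authors' algorithms. It assumes flowtime means completion time minus arrival time, that "strict locality" means a map task runs only on a machine holding its input block, and that reduce placement picks the least-loaded machine. All names (`Job`, `total_flowtime`, `pick_local_map_task`, `place_reduce_task`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    arrival: float      # time the job entered the cluster
    completion: float   # time its last task finished


def total_flowtime(jobs):
    """Objective (assumed definition): sum of completion - arrival over all jobs."""
    return sum(j.completion - j.arrival for j in jobs)


def pick_local_map_task(machine, pending, block_locations):
    """Strict locality (assumed rule): return the first pending map task whose
    input block is stored on `machine`, or None if there is no such task."""
    for task in pending:
        if machine in block_locations[task]:
            return task
    return None


def place_reduce_task(load):
    """Load-balance-aware placement (assumed rule): pick the reduce machine
    with the least work currently assigned."""
    return min(load, key=load.get)
```

For example, under these assumptions a machine that holds no input block for any pending map task is given nothing (`None`) instead of a remote task, which is the trade-off a strict locality policy makes.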


Journal: IEICE Transactions on Information and Systems
Publication Date: Jan 1, 2013
Citations: 1
License type: free


