Abstract

Running MapReduce programs in the cloud introduces a unique problem: how to provision resources so as to minimize the monetary cost or the job finish time for a specific job? We study the whole MapReduce processing pipeline and build a cost function that explicitly models the relationship among the time cost, the amount of input data, the available system resources (Map and Reduce slots), and the complexity of the Reduce function for the target MapReduce job. The model parameters can be learned from test runs. Based on this cost function, we can solve a number of decision problems, such as finding the amount of resources that minimizes monetary cost within a job finish deadline, minimizes time cost under a given monetary budget, or achieves the optimal tradeoff between time and monetary costs. Experimental results show that the proposed approach performs well on a number of sample MapReduce programs on both an in-house cluster and Amazon EC2. We also conducted a variance analysis on different components of the MapReduce workflow to identify possible sources of modeling error. Our optimization results show that the proposed approach saves a significant amount of time and money compared to randomly selected settings.
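To make the decision problems concrete, the sketch below illustrates one of them (cheapest provisioning under a deadline) with a simple cost function of the kind the abstract describes. The functional form, the parameter values (alpha, beta, gamma), and the per-slot price are hypothetical placeholders standing in for parameters that, per the paper, would be learned from test runs; this is not the paper's actual model.

    # Hypothetical time-cost model: Map work scales with data per Map
    # slot m; Reduce work scales with (Reduce-function-dependent) work
    # per Reduce slot r. alpha/beta/gamma are assumed learned values.
    def time_cost(m, r, data_gb, alpha=10.0, beta=120.0, gamma=200.0):
        return alpha + beta * data_gb / m + gamma * data_gb / r

    # Monetary cost: number of rented slots times running time (hours),
    # at an assumed flat per-slot-hour price.
    def money_cost(m, r, data_gb, price_per_slot_hour=0.10):
        hours = time_cost(m, r, data_gb) / 3600.0
        return price_per_slot_hour * (m + r) * hours

    # Decision problem: minimize monetary cost subject to a deadline.
    # Brute-force search over slot allocations is adequate at this scale.
    def cheapest_within_deadline(data_gb, deadline_s, max_slots=100):
        best = None
        for m in range(1, max_slots + 1):
            for r in range(1, max_slots + 1):
                if time_cost(m, r, data_gb) <= deadline_s:
                    c = money_cost(m, r, data_gb)
                    if best is None or c < best[0]:
                        best = (c, m, r)
        return best  # (cost, map_slots, reduce_slots), or None if infeasible

    if __name__ == "__main__":
        # e.g., 50 GB of input with a one-hour deadline
        print(cheapest_within_deadline(data_gb=50, deadline_s=3600))

The other problems in the abstract follow the same pattern: to minimize time under a budget, swap the objective and the constraint in the search loop.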
