Abstract

The MapReduce framework has become a leading scheme for processing large-scale data applications in recent years. However, big data applications executed on computer clusters consume a large amount of energy, which accounts for a considerable fraction of a data center's overall costs. Reducing energy consumption has therefore become a critical issue for data centers. Although Hadoop YARN adopts fine-grained resource management schemes for job scheduling, it does not address energy saving. In this paper, an Energy-aware Fair Scheduling framework based on YARN (denoted as EFS) is proposed, which can effectively reduce energy consumption while meeting the required Service Level Agreements (SLAs). EFS can not only schedule jobs to energy-efficient nodes but also power nodes on or off. To do so, energy-aware dynamic capacity management with a deadline-driven policy is used to allocate resources for MapReduce tasks according to the average execution time of containers and the resources requested by users. The energy-aware fair scheduling problem is then modeled as a multi-dimensional knapsack problem (MKP), and an energy-aware greedy algorithm (EAGA) is proposed to achieve fine-grained placement of tasks on energy-efficient nodes. Finally, nodes that have remained idle for a threshold duration are turned off to reduce energy costs. We perform extensive experiments on Hadoop YARN clusters to compare the energy consumption and execution time of EFS with those of some state-of-the-art policies. The experimental results show that EFS can not only keep the proper number of nodes powered on to meet the computing requirements but also achieve the goal of energy savings.
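
The placement and power-down steps described above can be illustrated with a minimal sketch. This is not the paper's EAGA: the abstract does not give the algorithm's details, so the greedy rule here (largest request first, then the powered-on node with the lowest `watts_per_vcore` that still fits in every resource dimension) is an assumption. The `Node` class, the `watts_per_vcore` score, and the function names `greedy_place` and `power_off_idle` are all hypothetical and are not part of YARN.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    vcores: int              # free CPU capacity
    mem_mb: int              # free memory capacity
    watts_per_vcore: float   # hypothetical efficiency score (lower is better)
    powered_on: bool = True
    idle_seconds: float = 0.0
    tasks: list = field(default_factory=list)

    def fits(self, vcores, mem_mb):
        # Multi-dimensional knapsack constraint: every dimension must fit.
        return self.vcores >= vcores and self.mem_mb >= mem_mb


def greedy_place(tasks, nodes):
    """Assign each (task_id, vcores, mem_mb) to the most efficient node that fits."""
    placement = {}
    # Largest requests first (an assumed ordering, not taken from the paper).
    ordered = sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True)
    for task_id, vcores, mem_mb in ordered:
        candidates = [n for n in nodes if n.fits(vcores, mem_mb)]
        if not candidates:
            placement[task_id] = None  # no capacity anywhere: the task waits
            continue
        # Prefer nodes that are already on, then the lowest watts per vcore.
        best = min(candidates, key=lambda n: (not n.powered_on, n.watts_per_vcore))
        best.powered_on = True
        best.idle_seconds = 0.0
        best.vcores -= vcores
        best.mem_mb -= mem_mb
        best.tasks.append(task_id)
        placement[task_id] = best.name
    return placement


def power_off_idle(nodes, threshold_seconds):
    """Turn off task-free nodes that have idled for at least the threshold."""
    turned_off = []
    for n in nodes:
        if n.powered_on and not n.tasks and n.idle_seconds >= threshold_seconds:
            n.powered_on = False
            turned_off.append(n.name)
    return turned_off


# Illustrative data only; the capacities and scores are made up.
nodes = [
    Node("a", 4, 8192, 10.0, idle_seconds=600),
    Node("b", 4, 8192, 6.0),
    Node("c", 8, 16384, 8.0),
]
tasks = [("t1", 2, 2048), ("t2", 2, 2048), ("t3", 2, 2048)]
placement = greedy_place(tasks, nodes)
turned_off = power_off_idle(nodes, threshold_seconds=300)
```

In this toy run the two most efficient nodes absorb all three tasks, so the least efficient node stays empty and becomes a candidate for power-down once its idle time passes the threshold. A greedy rule like this does not guarantee an optimal knapsack solution; it only trades optimality for a fast decision.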

