Abstract

Hadoop YARN has become a dominant framework for big data analysis and processing. However, the built-in scheduler in the Hadoop YARN framework is not designed for energy efficiency. To overcome this problem, this paper presents an energy-efficient extension to the existing Hadoop YARN framework. In addition, we formulate MapReduce scheduling in a heterogeneous Hadoop YARN cluster as an energy consumption optimization problem and propose a heuristic algorithm to solve it. The proposed algorithm combines load balancing with dynamic voltage/frequency scaling to improve both the performance and the energy efficiency of the Hadoop YARN cluster. We evaluate the effectiveness of our method through extensive experiments on a real Hadoop YARN cluster consisting of five servers. The results show that our method provides significant energy savings and achieves better performance than three alternative methods applied to similar problems.
