Consuming Hadoop MapReduce via virtual infrastructure as a service is becoming common practice as cloud service providers (CSPs) offer relevant applications and scalable resources. One of the predominant requirements of cloud users is to improve resource utilization in the virtual cluster during the service period. However, this may not be possible when MapReduce workloads and virtual machines (VMs) are highly heterogeneous. Therefore, in this paper, we address these heterogeneities and propose an efficient MapReduce scheduler that improves resource utilization by placing the right combination of map and reduce tasks in each VM of the virtual cluster. To achieve this, we transform the MapReduce task scheduling problem into a two-dimensional (2D) bin packing model and obtain an optimal schedule using the ant colony optimization (ACO) algorithm. As an added advantage, our proposed ACO-based bin packing (ACO-BP) scheduler minimizes the makespan for a batch of jobs. To showcase the performance improvement, we compare our proposed scheduler with three existing schedulers that work well in a heterogeneous environment. As expected, the results show that ACO-BP significantly outperforms the existing schedulers when dealing with workload-level and VM-level heterogeneities.
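To make the 2D bin packing view concrete, the following is a minimal illustrative sketch and not the ACO-BP scheduler itself. It assumes that each task is described by a (CPU, memory) demand and each VM by a (CPU, memory) capacity. The pheromone encoding, the slack-based heuristic, the scoring function, and all names and parameter values (`aco_pack`, `score_solution`, `rho`, `ants`) are hypothetical choices made for illustration.

```python
import random


def score_solution(assign, vms, free):
    """Score = number of placed tasks + mean utilization of the VMs in use."""
    placed = sum(1 for j in assign if j is not None)
    used = sorted({j for j in assign if j is not None})
    if not used:
        return float(placed)
    util = 0.0
    for j in used:
        cpu_util = (vms[j][0] - free[j][0]) / vms[j][0]
        mem_util = (vms[j][1] - free[j][1]) / vms[j][1]
        util += (cpu_util + mem_util) / 2.0
    return placed + util / len(used)


def aco_pack(tasks, vms, ants=10, iterations=30, rho=0.1, seed=0):
    """Toy ant colony search that packs (cpu, mem) tasks into (cpu, mem) VMs.

    Assumes ants >= 1 and iterations >= 1.
    """
    rng = random.Random(seed)
    # tau[t][j]: pheromone for placing task t on VM j (hypothetical encoding).
    tau = [[1.0] * len(vms) for _ in tasks]
    best_assign, best_score = None, -1.0

    for _ in range(iterations):
        for _ in range(ants):
            free = [list(v) for v in vms]      # remaining capacity per VM
            assign = [None] * len(tasks)
            for t, (cpu, mem) in enumerate(tasks):
                feasible = [j for j, (fc, fm) in enumerate(free)
                            if fc >= cpu and fm >= mem]
                if not feasible:
                    continue                   # task stays unplaced for this ant
                weights = []
                for j in feasible:
                    # Heuristic: prefer VMs left with little slack (tight fit).
                    slack = ((free[j][0] - cpu) / vms[j][0]
                             + (free[j][1] - mem) / vms[j][1])
                    weights.append(tau[t][j] / (1.0 + slack))
                j = rng.choices(feasible, weights=weights, k=1)[0]
                free[j][0] -= cpu
                free[j][1] -= mem
                assign[t] = j

            score = score_solution(assign, vms, free)
            if score > best_score:
                best_assign, best_score = assign, score

        # Evaporate all pheromone, then reinforce the best solution so far.
        for row in tau:
            for j in range(len(row)):
                row[j] *= (1.0 - rho)
        for t, j in enumerate(best_assign):
            if j is not None:
                tau[t][j] += 1.0

    return best_assign, best_score
```

As a usage example, `aco_pack([(1, 2), (2, 2), (1, 1), (2, 4)], [(4, 8), (8, 16)])` returns a list giving the VM index chosen for each task together with its score. The sketch ignores map/reduce phase ordering, task durations, and makespan, all of which a real scheduler would need to model.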