Abstract

The MapReduce framework has quickly established itself as a vital distributed computing model for data-intensive applications. The default Hadoop scheduler is limited by its assumption that cluster nodes are homogeneous. In a heterogeneous Hadoop cluster, slow-running tasks and TaskTrackers prolong job execution time. In this paper, we propose a novel MapReduce scheduler that identifies straggler tasks and fast TaskTrackers in a heterogeneous Hadoop cluster so that the JobTracker can assign the slow tasks to the fast TaskTrackers. Our experimental results show a consistent reduction in job completion time compared with the LATE scheduler and the default Hadoop scheduler across various workloads of the Hi-Bench benchmark suite.
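As an illustration only, the scheduling idea described above (find straggler tasks, find fast TaskTrackers, pair them) might be sketched as follows. This is not the paper's algorithm: the `progress_rate` and `speed` metrics, the mean-based thresholds, the `slow_fraction` parameter, and all class and function names are hypothetical assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    progress_rate: float  # assumed metric: fraction of work completed per second


@dataclass
class Tracker:
    name: str
    speed: float  # assumed metric: relative processing speed of the node


def find_stragglers(tasks, slow_fraction=0.5):
    """Return tasks whose rate is below slow_fraction * mean rate, slowest first.

    The mean-based threshold is an assumption made for this sketch.
    """
    if not tasks:
        return []
    mean_rate = sum(t.progress_rate for t in tasks) / len(tasks)
    slow = [t for t in tasks if t.progress_rate < slow_fraction * mean_rate]
    return sorted(slow, key=lambda t: t.progress_rate)


def find_fast_trackers(trackers):
    """Return trackers faster than the cluster mean, fastest first (assumed rule)."""
    if not trackers:
        return []
    mean_speed = sum(tr.speed for tr in trackers) / len(trackers)
    fast = [tr for tr in trackers if tr.speed > mean_speed]
    return sorted(fast, key=lambda tr: tr.speed, reverse=True)


def assign_stragglers(tasks, trackers):
    """Pair the slowest tasks with the fastest trackers, one task per tracker."""
    stragglers = find_stragglers(tasks)
    fast = find_fast_trackers(trackers)
    return {task.task_id: tracker.name for task, tracker in zip(stragglers, fast)}


tasks = [Task("t1", 0.10), Task("t2", 0.09), Task("t3", 0.01)]
trackers = [Tracker("node-a", 1.0), Tracker("node-b", 3.0), Tracker("node-c", 0.5)]
plan = assign_stragglers(tasks, trackers)
print(plan)
```

In this toy cluster only `t3` falls below half the mean progress rate and only `node-b` is faster than the mean node speed, so the sketch reassigns `t3` to `node-b`. A real scheduler would also have to account for data locality, free task slots, and how the metrics are estimated at run time.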
