Abstract
MapReduce is a programming paradigm widely used for data-intensive applications in distributed computing. Divisible load theory (DLT) is another model, one that partitions a divisible load for parallel processing. In this work, a new job scheduler, the virtual job scheduler (VJS), is developed to schedule MapReduce jobs based on DLT. VJS constructs a virtual job set from the queue of jobs awaiting execution by considering the CPU and I/O resource utilization levels of each job. The core of VJS is its partitioning algorithm. Two novel partitioning algorithms are proposed: two-level successive partitioning (TLSP) and predictive partitioning (PRED). TLSP applies DLT to the map and reduce phases in succession. The second load partitioning, performed during the comparatively longer reduce phase, exploits the advantage of DLT better. PRED is a modification of TLSP that aims to optimize the overall schedule length by taking the idle time of the reducers into consideration. Evaluation of these two models across various execution environments indicates a profound decrease in the wait time of reducers and, as a result, a significant reduction in the makespan of the whole job. VJS with the PRED partitioning algorithm proved suitable for heterogeneous environments, with higher resource utilization than the default Hadoop MapReduce scheduler.
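As an illustration of the DLT idea the abstract relies on, the following is a minimal sketch and not the paper's TLSP or PRED algorithm. It assumes the simplest textbook setting: communication cost is ignored and each worker has a known constant processing speed, so the load is split in proportion to speed and all workers finish at the same time. The function names (`dlt_partition`, `finish_times`, `two_level_partition`) and the example speeds are hypothetical.

```python
from typing import List, Tuple


def dlt_partition(total_load: float, speeds: List[float]) -> List[float]:
    """Split total_load in proportion to worker speeds.

    Assumption: no communication cost, so equal finish times are reached
    when each share is proportional to the worker's speed.
    """
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be non-empty and strictly positive")
    total_speed = sum(speeds)
    return [total_load * s / total_speed for s in speeds]


def finish_times(shares: List[float], speeds: List[float]) -> List[float]:
    """Time each worker needs to process its share (share / speed)."""
    return [share / s for share, s in zip(shares, speeds)]


def two_level_partition(
    map_load: float,
    reduce_load: float,
    map_speeds: List[float],
    reduce_speeds: List[float],
) -> Tuple[List[float], List[float]]:
    """Apply the same proportional split to the map phase and then the reduce phase.

    This only mirrors the "partition both phases in succession" idea from the
    abstract; the actual TLSP rules are not given there.
    """
    return (
        dlt_partition(map_load, map_speeds),
        dlt_partition(reduce_load, reduce_speeds),
    )


if __name__ == "__main__":
    # Hypothetical heterogeneous cluster: three workers with relative speeds 1, 2, 3.
    shares = dlt_partition(120.0, [1.0, 2.0, 3.0])
    print(shares)
    print(finish_times(shares, [1.0, 2.0, 3.0]))
```

With the proportional split, a faster worker receives a larger share and no worker sits idle waiting for a slower one, which is the property a DLT-based scheduler tries to preserve in heterogeneous environments.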
More From: Concurrency and Computation: Practice and Experience