Abstract
Data is the fastest-growing asset of the 21st century, and extracting insights from it has become essential, as traditional ecosystems are incapable of processing data at the resulting volume, with its varying levels of structure, and at the speed at which it is produced. Within this paradigm, the need to process mostly real-time data, among other factors, highlights the need for optimized Job Scheduling Algorithms, which are the interest of this paper. Job scheduling is one of the most important aspects of guaranteeing an efficient processing ecosystem with minimal execution time, exploiting the available resources while granting all users a fair share of those resources. Through this work, we lay the needed background on the Hadoop MapReduce framework. We run a comparative analysis of different algorithms classified by different criteria. Light is shed on several classifications: Cluster Environment, Job Allocation Strategy, Optimization Strategy, and Metrics of Quality. We also construct use cases to showcase the characteristics of selected Job Scheduling Algorithms, then present a comparative display featuring the details of these use cases.
Highlights
The new digital world is growing: new day-to-day habits are adopted, and every aspect of the world as we previously knew it has a digital equivalent. Connectivity changing from a luxury to a necessity has changed the role the internet plays. The massive increase in data-generating devices and end users, the emergence of modern terms like the Internet of Things (IoT), and the new digital life through social media all affect the amount of data being generated, stored, and in need of processing (Amir & Murtaza, 2015). As a result, big data and its frameworks or ecosystems are the keywords used to indicate the need for distributed, parallel computing.
Hadoop processes data in a distributed, parallel fashion (Hashem et al., 2016). Its default job scheduling mechanism was based on FIFO (Senthilkumar Ilango, 2016). Scheduling was later removed from MapReduce as a built-in component and is now considered a pluggable component, allowing the MapReduce job scheduling algorithm and technique to be customized per project.
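To illustrate the behavior of the default FIFO policy mentioned above, the following is a minimal conceptual sketch (not Hadoop's actual implementation; the class and job names are hypothetical): jobs are dispatched strictly in submission order, which is simple but lets a long job block shorter ones behind it.

```python
from collections import deque

class FifoScheduler:
    """Minimal sketch of FIFO job scheduling: jobs run strictly
    in submission order, regardless of size or priority."""

    def __init__(self):
        self._queue = deque()

    def submit(self, job_id):
        # Jobs are appended in arrival order.
        self._queue.append(job_id)

    def next_job(self):
        # The oldest submitted job is always dispatched first.
        return self._queue.popleft() if self._queue else None

# A short job submitted after a long one must wait its turn,
# illustrating FIFO's head-of-line blocking drawback.
sched = FifoScheduler()
sched.submit("long_batch_job")
sched.submit("short_interactive_job")
order = [sched.next_job(), sched.next_job()]
print(order)  # ['long_batch_job', 'short_interactive_job']
```

This head-of-line blocking is precisely the kind of drawback that motivates the alternative schedulers (e.g., fair-share and capacity-based policies) surveyed later in this work.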
Scheduling has been a persistent issue in various systems and clusters for a long time. The need to run tasks and allocate resources to those tasks summarizes the idea behind Job Scheduling, which has been, and continues to be, an area of interest in the research field (Mohamed & Hong, 2016).
Summary
From a practical perspective, job schedulers aim to tackle a few issues resulting from the MapReduce paradigm, alongside resource managers and negotiators. Ultimately, these schedulers are sometimes used together with optimizers acting as heuristics to achieve a certain objective, or set of objectives, under a given constraint (Hashem et al., 2018). Our motive for this work is to highlight the current job scheduling algorithms along with their strengths and drawbacks, to help find a new algorithm, or a hybrid of existing algorithms, acting as a unified ecosystem that addresses the issues considered major drawbacks in other algorithms. This survey is organized into five chapters.