MapReduce Scheduling Algorithms Research Articles

Abstract: The valuable knowledge that can be retrieved from huge volumes of data (Big Data) has set in motion the development of frameworks that process data using parallel and distributed computing, including Apache Hadoop, Facebook Corona, and Microsoft Dryad. Apache Hadoop is an open source implementation of Google MapReduce that has attracted strong attention from the research community in both academia and industry. Hadoop MapReduce scheduling algorithms play a critical role in the management of large commodity clusters, controlling QoS requirements by supervising the execution of users, jobs, and tasks. Hadoop MapReduce comprises three schedulers: FIFO, Fair, and Capacity. However, the research community has developed new optimizations to account for advances and dynamic changes in hardware and operating environments. Numerous efforts have been made in the literature to address network congestion, straggling, data locality, heterogeneity, resource under-utilization, and skew mitigation in Hadoop scheduling. The volume of research published in journals and conferences about Hadoop scheduling has consistently increased, which makes it difficult for researchers to grasp the overall state of the research and the areas that require further investigation. In this study, a scientific literature review has been conducted to assess preceding research contributions to the Apache Hadoop scheduling mechanism. We classify and quantify the main issues addressed in the literature based on their jargon and the areas addressed. Moreover, we explain and discuss the various challenges and open issues in Hadoop scheduling optimizations.


Many companies are increasingly using MapReduce for efficient large-scale data processing such as personalized advertising, spam detection, and various data mining tasks. Cloud computing offers an attractive option for businesses to rent a suitably sized Hadoop cluster, consume resources as a service, and pay only for the resources that were used. One of the open questions in such environments is the amount of resources that a user should lease from the service provider. Often, a user targets specific performance goals, and the application needs to complete data processing by a certain deadline. Currently, however, the task of estimating the resources required to meet application performance goals is solely the user's responsibility. In this work, we introduce a novel framework and technique to address this problem and to offer a new resource sizing and provisioning service in MapReduce environments. For a MapReduce job that needs to be completed within a certain time, the job profile is built from the job's past executions or by executing the application on a smaller data set using an automated profiling tool. MapReduce applications are used to retrieve data according to the request. To process big data, proper scheduling is required to achieve better performance. Scheduling is a technique for assigning jobs to available resources in a way that minimizes starvation and maximizes resource utilization. The performance of a scheduling technique can be improved by applying deadline constraints to jobs. The goal is to study MapReduce and the different scheduling algorithms that can be used to achieve better performance.
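The idea of scheduling under deadline constraints can be illustrated with a minimal sketch. It is not the method of any article listed here: it assumes a hypothetical `Job` record with an estimated duration and an absolute deadline, assumes all jobs are available at time 0, and compares submission-order (FIFO) dispatch with earliest-deadline-first (EDF) dispatch on a fixed number of slots. All names below are illustrative.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical job record; field names are assumptions for illustration.
    name: str
    arrival: int   # submission order (not a release time in this sketch)
    duration: int  # estimated run time, in arbitrary time units
    deadline: int  # absolute deadline, in the same time units


def schedule(jobs, slots, key):
    """Greedy list scheduling: take jobs in `key` order and place each one
    on the slot that becomes free first. Returns {job name: finish time}."""
    free_at = [0] * slots  # min-heap of the times at which slots become free
    heapq.heapify(free_at)
    finish = {}
    for job in sorted(jobs, key=key):
        start = heapq.heappop(free_at)
        end = start + job.duration
        heapq.heappush(free_at, end)
        finish[job.name] = end
    return finish


def missed(jobs, finish):
    """Names of the jobs that finish after their deadline."""
    return sorted(j.name for j in jobs if finish[j.name] > j.deadline)


JOBS = [
    Job("A", arrival=0, duration=8, deadline=20),
    Job("B", arrival=1, duration=2, deadline=3),
    Job("C", arrival=2, duration=3, deadline=5),
]

fifo = schedule(JOBS, slots=1, key=lambda j: j.arrival)   # submission order
edf = schedule(JOBS, slots=1, key=lambda j: j.deadline)   # earliest deadline first

print("FIFO missed deadlines:", missed(JOBS, fifo))
print("EDF missed deadlines: ", missed(JOBS, edf))
```

In this toy workload, the long job submitted first delays the two short jobs past their deadlines under FIFO, while ordering by deadline lets all three finish in time. A real Hadoop scheduler would also have to handle task-level placement, data locality, and jobs arriving over time, which this sketch leaves out.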


Articles published on MapReduce Scheduling Algorithms

Jargon of Hadoop MapReduce scheduling techniques: a scientific categorization

MapReduce scheduling algorithms: a review

Exact and heuristic MapReduce scheduling algorithms for cloud federation

LEARN ON MAPREDUCE METHOD FOR JOB SCHEDULING BY USING BIGDATA

Multiple MapReduce Jobs in Distributed Scheduler for Big Data Applications

Energy-Aware Scheduling of MapReduce Jobs for Big Data Applications

Evaluating map reduce tasks scheduling algorithms over cloud computing infrastructure

Classification Framework of MapReduce Scheduling Algorithms

Survey on MapReduce Scheduling Algorithms

A Comprehensive View of Hadoop MapReduce Scheduling Algorithms

Improving MapReduce Scheduling Algorithm Using Prediction-Based Application Model in the Cloud

