Abstract

This work addresses the optimization of file locality, file availability, and replica migration cost in a Hadoop architecture. Our optimization algorithm is based on the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) and simultaneously determines file block placement, with a variable replication factor, and MapReduce job scheduling. Our proposal has been tested in experiments covering three data center sizes (8, 16, and 32 nodes) with the same workload and number of files (150 files and 3519 file blocks). In general terms, a placement policy with a variable replication factor yields the largest improvements across our three optimization objectives. In contrast, a job scheduling policy improves these objectives only when combined with a variable replication factor. The results also show that migration cost is a suitable optimization objective, as significant improvements of up to 34% were observed between experiments.
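
To make the problem formulation concrete, the following is a minimal sketch (not the authors' implementation) of how a candidate solution with a variable replication factor might be encoded and scored on the three objectives named in the abstract. The class and function names (Cluster, locality, availability, migration_cost) and the scoring formulas are illustrative assumptions; in practice these objective evaluations would be plugged into an NSGA-II implementation.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    """Illustrative cluster state (hypothetical structure, not from the paper)."""
    task_node: dict          # map task id -> node it is scheduled on (job scheduling part)
    block_tasks: dict        # block id -> list of map task ids that read it
    current_placement: dict  # block id -> set of nodes currently holding a replica

def locality(placement: dict, c: Cluster) -> float:
    """Fraction of map tasks whose input block has a replica on the task's node."""
    local = sum(
        1
        for block, tasks in c.block_tasks.items()
        for t in tasks
        if c.task_node[t] in placement[block]
    )
    total = sum(len(tasks) for tasks in c.block_tasks.values())
    return local / total if total else 0.0

def availability(placement: dict, c: Cluster) -> float:
    """Average replication factor, used here as a simple proxy for availability."""
    return sum(len(nodes) for nodes in placement.values()) / len(placement)

def migration_cost(placement: dict, c: Cluster) -> int:
    """Number of block copies that must be created on nodes not yet holding them."""
    return sum(
        len(nodes - c.current_placement[block])
        for block, nodes in placement.items()
    )

def objectives(placement: dict, c: Cluster) -> tuple:
    # NSGA-II conventionally minimizes, so maximized objectives are negated.
    return (-locality(placement, c),
            -availability(placement, c),
            migration_cost(placement, c))
```

A candidate `placement` here maps each block id to a set of nodes, so different blocks may carry different replica counts, which is what makes the replication factor variable in this sketch.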
