Cross-MapReduce: Data transfer reduction in geo-distributed MapReduce

Saeed Mirpour Marzuni, Abdorreza Savadi, Adel N. Toosi, Mahmoud Naghibzadeh

doi:10.1016/j.future.2020.09.009

https://doi.org/10.1016/j.future.2020.09.009

Abstract

The MapReduce model is widely used to store and process big data in a distributed manner. MapReduce was originally developed for a single, tightly coupled cluster of computers. Approaches such as Hierarchical and Geo-Hadoop are designed to address geo-distributed MapReduce processing. However, these methods still suffer from high inter-cluster data transfer over the Internet, which is prohibitive for processing today’s globally distributed big data. Based on the observation that the entire intermediate results need not be transferred to a single global reducer, we propose Cross-MapReduce, a framework for geo-distributed MapReduce processing. Before any massive data transfer takes place, the proposed method finds a set of best global reducers that minimizes the volume of transferred data. We propose a graph called the Global Reduction Graph (GRG) to determine the number and the locations of the global reducers. We conducted extensive experimental evaluations on a real testbed to demonstrate the effectiveness of Cross-MapReduce. The experimental results show that Cross-MapReduce significantly outperforms the Hierarchical and Geo-Hadoop approaches and reduces the amount of data transferred over the Internet by 40%.
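The idea of avoiding a single global reducer can be illustrated with a small sketch. It is not the paper's Global Reduction Graph, whose construction the abstract does not describe; it is a simple greedy stand-in that assumes the size of the intermediate results is known per cluster and per key group before any transfer happens. The cluster names, key groups, sizes, and all function names below are hypothetical.

```python
from typing import Dict

# Hypothetical input: intermediate-result size (MB) per cluster and key group.
Sizes = Dict[str, Dict[str, int]]

sizes: Sizes = {
    "A": {"k1": 90, "k2": 10},
    "B": {"k1": 20, "k2": 80},
    "C": {"k1": 5, "k2": 30},
}


def single_reducer_transfer(sizes: Sizes) -> int:
    """Inter-cluster volume if everything is sent to the best single cluster."""
    totals = [sum(per_key.values()) for per_key in sizes.values()]
    # Data already stored at the chosen cluster does not cross the Internet.
    return sum(totals) - max(totals)


def per_key_reducers(sizes: Sizes) -> Dict[str, str]:
    """Greedy placement (an assumption, not the GRG): each key group is
    reduced at the cluster that already holds most of its data."""
    keys = {k for per_key in sizes.values() for k in per_key}
    return {k: max(sizes, key=lambda c: sizes[c].get(k, 0)) for k in sorted(keys)}


def per_key_transfer(sizes: Sizes, placement: Dict[str, str]) -> int:
    """Inter-cluster volume when each key group has its own reducer location."""
    return sum(
        volume
        for cluster, per_key in sizes.items()
        for key, volume in per_key.items()
        if placement[key] != cluster
    )


placement = per_key_reducers(sizes)
print(single_reducer_transfer(sizes))       # 135
print(placement)                            # {'k1': 'A', 'k2': 'B'}
print(per_key_transfer(sizes, placement))   # 65
```

With these made-up sizes, sending everything to one cluster moves 135 MB, while placing one reducer per key group moves 65 MB. The sketch counts transferred volume only; it ignores bandwidth differences between clusters and the cost of combining the reducers' outputs afterwards.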

Journal: Future Generation Computer Systems
Publication Date: Sep 11, 2020
Citations: 8

Similar Papers

Call for Papers—Interfaces Special Issue: Applications of Analytics and Operations Research in Big Data Analysis
Deepak S Turaga
Interfaces | VOL. 45
01 Oct 2015

An efficient scalable and flexible data transfer architecture for multiprocessor SoC with massive distributed memory
Sang-Il Han ... Amer Baghdadi
07 Jun 2004

Cloud computing and big data: Technologies and applications
Mostapha Zbakh ... Mohamed Bakhouya
Concurrency and Computation: Practice and Experience | VOL. 29
29 Mar 2017

A novel system architecture for secure authentication and data sharing in cloud enabled Big Data Environment
Uma Narayanan ... Shelbi Joseph
Journal of King Saud University - Computer and Information Sciences | VOL. 34
19 May 2020
