Scalable data management for map-reduce-based data-intensive applications: a view for cloud and hybrid infrastructures

Gabriel Antoniu, Kate Keahey, Bogdan Nicolae, Anthony Simonet, Frédéric Suter, Christophe Blanchet, Gilles Fedak, Julien Bigot, Frédéric Desprez, Raphael Terreux, Sylvain Gault, François Briant, Christian Pérez, Alexandru Costan, Franck Cappello, Bing Tang, Luc Bougé

doi:10.1504/ijcc.2013.055265

Abstract

As map-reduce emerges as a leading programming paradigm for data-intensive computing, the frameworks that support it today still have substantial shortcomings that limit its scalability. In this paper, we discuss several directions where there is room for progress: storage efficiency under massive data-access concurrency, scheduling, volatility and fault tolerance. We place our discussion in the perspective of the current evolution towards increasing integration of large-scale distributed platforms (clouds, cloud federations, enterprise desktop grids, etc.). We propose an approach that aims to overcome the limitations of existing map-reduce frameworks in order to achieve scalable, concurrency-optimised, fault-tolerant map-reduce data processing on hybrid infrastructures. This approach will be evaluated with real-life bio-informatics applications on existing Nimbus-powered cloud testbeds interconnected with desktop grids.

Journal: International Journal of Cloud Computing
Publication Date: Apr 20, 2012
Citations: 40

Similar Papers

BIGhybrid: a simulator for MapReduce applications in hybrid distributed infrastructures validated with the Grid5000 experimental platform
Julio C S Anjos ... Claudio F R Geyer
Concurrency and Computation: Practice and Experience | VOL. 28
22 Sep 2015

On Resource Volatility in Enterprise Desktop Grids
04 Dec 2006

Characterizing resource availability in enterprise desktop grids
Derrick Kondo ... Henri Casanova
Future Generation Computer Systems | VOL. 23
24 Jan 2007

Availability Traces of Enterprise Desktop Grids
Derrick Kondo ... Andrew Chien
01 Jan 2006
