Towards the design of optimal data redundancy schemes for heterogeneous cloud storage infrastructures

Lluis Pamies-Juarez, Pedro García-López, Marc Sánchez-Artigas, Blas Herrera

doi:10.1016/j.comnet.2010.11.004

Abstract

End-user data storage requirements keep growing, demanding more capacity, more reliability, and the ability to access information from anywhere. Cloud storage services meet this demand by providing transparent and reliable storage solutions. Most of these solutions are built on distributed infrastructures that rely on data redundancy to guarantee 100% data availability. Unfortunately, existing redundancy schemes very often assume that resources are homogeneous, an assumption that may increase storage costs in heterogeneous infrastructures such as clouds built from voluntary resources. In this work, we analyze how distributed redundancy schemes can be optimally deployed over heterogeneous infrastructures. Specifically, we are interested in infrastructures whose nodes have different online availabilities. Taking these heterogeneities into account, we present a mechanism that measures data availability more precisely than existing works. Using this mechanism, we infer the optimal data placement policy, which reduces the redundancy used and, in turn, its associated overheads. In heterogeneous settings, our results show that data redundancy can be reduced by up to 70%.
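
To make the notion of data availability under heterogeneous node availabilities concrete, the following is a minimal sketch and not the mechanism proposed in the paper. It assumes a threshold scheme in which an object is retrievable whenever at least k of the nodes holding its blocks are online, and it assumes that nodes are online independently of each other. The function name `data_availability` and the sample availability values are hypothetical.

```python
from typing import Sequence


def data_availability(node_availabilities: Sequence[float], k: int) -> float:
    """Probability that at least k of the given nodes are online at once.

    Assumption: nodes are online independently, each with its own probability.
    dist[j] holds the probability that exactly j of the nodes processed so far
    are online (a Poisson binomial distribution built node by node).
    """
    dist = [1.0]
    for p in node_availabilities:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)  # this node is offline
            nxt[j + 1] += q * p      # this node is online
        dist = nxt
    return sum(dist[k:])


# Hypothetical node sets with the same mean availability (0.6) but a
# different spread; any 3 online nodes are assumed to suffice.
homogeneous = [0.6] * 6
heterogeneous = [0.9, 0.9, 0.9, 0.3, 0.3, 0.3]
print(data_availability(homogeneous, 3))
print(data_availability(heterogeneous, 3))
```

Under these assumptions, two node sets with the same mean availability can yield different data availability, which is why a model that replaces each node's availability with the mean may misjudge the amount of redundancy required.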