Abstract

The progressive growth in the volume of digital data has become a technological challenge of great interest in computer science. With the worldwide spread of personal computers and networks, content is being generated at ever larger scales and in formats far more diverse than before. Analyzing and extracting relevant knowledge from these large and complex masses of data is particularly interesting, but it first requires techniques that support their resilient storage. Storage systems very often rely on a replication scheme to preserve the integrity of stored data: copies of all information are generated so that individual hardware failures, inherent to any massive storage infrastructure, do not compromise access to what was stored. However, accommodating such copies requires raw storage space often much greater than the information would originally occupy. For this reason, error-correcting codes, or erasure codes, have been adopted; they take a mathematically more refined approach than simple replication and incur a smaller storage overhead than their predecessor techniques. The contribution of this work is a fully decentralized storage strategy that, on average, improves access latency by over 80% for both replicated and encoded data, while reducing by 55% the overhead of a terabyte-sized dataset when encoded, compared to related works in the literature.
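
As a concrete illustration of this overhead difference, the sketch below compares the raw space needed to store one terabyte under three-way replication and under a (k, m) erasure code; the (10, 4) parameters are assumptions made for the example, not the configuration evaluated in this work.

    # Illustrative storage-overhead comparison: 3-way replication vs. a
    # hypothetical (k, m) erasure code. The parameter values are assumptions
    # for the sake of the example, not this paper's evaluated configuration.

    def replication_overhead(copies: int) -> float:
        """Raw bytes stored per byte of user data under n-way replication."""
        return float(copies)

    def erasure_overhead(k: int, m: int) -> float:
        """Raw bytes stored per byte of user data when each object is split
        into k data fragments plus m parity fragments (tolerates losing any m)."""
        return (k + m) / k

    if __name__ == "__main__":
        logical_tb = 1.0  # logical dataset size in TB
        print(f"3-way replication  : {logical_tb * replication_overhead(3):.2f} TB raw")
        print(f"(10, 4) erasure code: {logical_tb * erasure_overhead(10, 4):.2f} TB raw")
        # 3.00 TB vs. 1.40 TB of raw space: encoding preserves fault tolerance
        # with far less redundancy, which is the kind of overhead reduction
        # that motivates the strategy described above.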

Highlights

  • Resilient storage of large-scale data, or Big Data, is one of the major infrastructure-support problems addressed in computer science (Alnafoosi and Steinbach, 2013; Hashem et al., 2015)

  • Among related works we find the Robot storage architecture (Yin et al., 2013), which relies solely on erasure codes for data storage and ignores replication as a combined approach, which, according to recent studies, may be a mistake (Gribaudo et al., 2016)

  • Measurements obtained by varying the size of objects stored with three-way replication or with erasure coding



Introduction

Resilient storage of large-scale data, or Big Data, is one of the major infrastructure-support problems addressed in computer science (Alnafoosi and Steinbach, 2013; Hashem et al., 2015). This means that, when it comes to valuable information, storage systems must be designed in such a way that no data is ever lost, regardless of external faults or factors common to any computational environment, such as hard-disk failures and server crashes. In this sense, many of the existing state-of-the-art technologies use a replication methodology, which consists of entirely copying and storing data at different, often geographically distant, locations, thereby adding a degree of redundancy (Gonizzi et al., 2015).
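
For concreteness, the following minimal sketch shows how replicas of an object could be placed on distinct nodes so that no single failure compromises access to the data; the node names, hash-based placement rule, and three-way replication factor are illustrative assumptions, not the placement policy of any particular system discussed here.

    import hashlib

    # Hypothetical cluster; node names and replication factor are assumptions
    # made only to illustrate replication across distinct locations.
    NODES = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    REPLICAS = 3

    def place_replicas(object_key: str) -> list[str]:
        """Choose REPLICAS distinct nodes for an object, so the loss of any
        single node (or disk) never makes the stored data unavailable."""
        digest = int(hashlib.sha256(object_key.encode()).hexdigest(), 16)
        start = digest % len(NODES)
        return [NODES[(start + i) % len(NODES)] for i in range(REPLICAS)]

    if __name__ == "__main__":
        print(place_replicas("dataset/part-00042"))  # three distinct node names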

