Abstract

Reliable data storage is crucial to production, transmission, transaction, consumption, and analysis in an Energy Internet (EI). Although mainstream distributed data storage is a plausible solution, existing methods suffer from a tradeoff between storage overhead (incurred by the replicas of data encodings needed for lossless recovery) and communication latency (caused by spikes in network traffic when data replicas are queried en masse across devices). To balance this tradeoff, we propose a Lightweight Dynamic Storage Algorithm based on Adaptive Encoding (LDSA-AE) approach for EI data storage. Our key idea is to classify the data into active and inactive categories, where the active data are the most likely to be accessed, and thus corrupted, at high frequency. Wherever the active data are housed, their replicas can therefore be proactively allocated to a set of nearby devices. The main challenges are to perform the classification in real time and to tailor the encoding methods to active and inactive data separately, according to their respective characteristics. To overcome these challenges, our LDSA-AE 1) proposes a novel density-based clustering algorithm to classify performance data in an online and unsupervised fashion and 2) leverages the Minimum Density RAID-6 (MDR) code and the Cauchy Reed-Solomon (CRS) code to encode active and inactive data, respectively, striving to ensure data storage with low overhead, low latency, high reliability, and high throughput at once. A theoretical analysis substantiates the viability and effectiveness of the proposed LDSA-AE approach. We also prototype LDSA-AE on a real-world server testbed, and the empirical study suggests that our approach is superior to state-of-the-art distributed storage schemes for EI in terms of storage overhead, repair throughput, and reliability.
