Abstract

This paper focuses on overcoming the problems caused by physically scattered chunks of data. Fragmentation occurs in two forms: sparse containers and out-of-order containers, both of which degrade restore speed and garbage-collection efficiency. Out-of-order containers slow restores, and the impact grows as the restore cache shrinks. To reduce fragmentation, we propose the History-Aware Rewriting (HAR) algorithm, which exploits historical information from previous backups to identify and reduce sparse containers. Each chunk is assigned a unique fingerprint by the Message Digest 5 (MD5) hash function. The logical block addresses are then used to merge the chunks and reconstruct the original file. The Data Encryption Standard (DES) is used to generate a secret key file, which is handed to the user when the data owner creates the user account. Used together, these algorithms minimize the fragmentation problem in an in-line deduplication backup storage system. The restore-performance improvement depends on the amount of duplicate data: simulation results show that when the same data is uploaded a second time, write performance improves by up to 80%, and by up to 90% for a third upload of the same data; the exact figure varies with the deduplication technique. When there is no duplicate data at all, the model does not affect the system.
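
To make the sparse-container idea concrete, the following sketch (Python, with hypothetical names) flags containers whose utilization in the most recent backup falls below a threshold, which is the kind of historical information HAR relies on; the threshold value and data-structure layout are illustrative assumptions, not the paper's parameters.

```python
SPARSE_THRESHOLD = 0.5  # assumed utilization cutoff; the paper tunes this value

def find_sparse_containers(container_sizes, referenced_bytes):
    """Return IDs of containers that look sparse based on the last backup.

    `container_sizes` maps container ID -> total bytes stored in it;
    `referenced_bytes` maps container ID -> bytes actually referenced by
    the most recent backup (the historical information HAR exploits).
    Chunks that fall in sparse containers are candidates for rewriting
    in the next backup instead of being referenced in place.
    """
    return {
        cid for cid, total in container_sizes.items()
        if referenced_bytes.get(cid, 0) / total < SPARSE_THRESHOLD
    }
```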
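The per-chunk fingerprinting step could look like the minimal sketch below, assuming fixed-size chunking and an in-memory fingerprint index; the chunk size and the `deduplicate` helper are hypothetical, not the paper's implementation.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; the paper does not specify one

def fingerprint_chunks(path):
    """Yield (md5_hex, chunk_bytes) for each fixed-size chunk of a file."""
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            yield hashlib.md5(chunk).hexdigest(), chunk

def deduplicate(path, index):
    """Store only chunks whose MD5 fingerprint is not yet in the index.

    `index` maps fingerprint -> chunk bytes here for simplicity; a real
    backup system would map fingerprints to container IDs instead.
    """
    duplicates = 0
    for digest, chunk in fingerprint_chunks(path):
        if digest in index:
            duplicates += 1          # chunk already stored: deduplicated
        else:
            index[digest] = chunk    # new chunk: write it out
    return duplicates
```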
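Restoring the original file from scattered chunks then amounts to ordering them by logical block address, as in this sketch; the `(lba, chunk)` record format is an assumption for illustration.

```python
def restore_file(chunk_records):
    """Reassemble the original file from (logical_block_address, chunk) pairs.

    Chunks may arrive out of order from scattered containers; sorting by
    logical block address recovers the original byte sequence.
    """
    ordered = sorted(chunk_records, key=lambda record: record[0])
    return b"".join(chunk for _, chunk in ordered)
```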
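Finally, the per-user secret key file could be produced as sketched below, assuming the PyCryptodome package for DES; the key-file layout, naming scheme, and ECB mode are illustrative choices rather than the paper's design.

```python
import os
from Crypto.Cipher import DES  # assumes the PyCryptodome package is installed

def create_user_key(user_name, key_dir="keys"):
    """Generate a DES secret key and save it to a per-user key file.

    Mirrors the step where the data owner hands a newly created user a
    key file; the directory and file naming are hypothetical.
    """
    key = os.urandom(8)  # DES keys are 8 bytes (56 effective key bits)
    os.makedirs(key_dir, exist_ok=True)
    with open(os.path.join(key_dir, f"{user_name}.key"), "wb") as f:
        f.write(key)
    return key

def encrypt_chunk(chunk, key):
    """Encrypt one chunk with DES, padding it to the 8-byte block size."""
    pad = 8 - len(chunk) % 8
    cipher = DES.new(key, DES.MODE_ECB)
    return cipher.encrypt(chunk + bytes([pad]) * pad)
```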
