Abstract

Although deduplication can reduce the volume of backup data, it pauses the running system to preserve data consistency. This problem becomes severe when the target data are virtual machine images (VMIs), which can reach several gigabytes in size. In this paper, we propose an online framework for VM image backup and recovery, called VMBackup, which comprises three major components: (1) Similarity Retrieval, which indexes chunks' fingerprints by their segment id for fast identification; (2) a one-level File-Index, which efficiently maps each file id to its content chunks in the correct order; and (3) an Adjacent Storage model, which places adjacent chunks of an image in the same disk partition to maximize chunk locality. The experimental results show that (1) images of the same OS series and the same customization share a high percentage of duplicated content, (2) variable-length chunk partitioning is superior to fixed-length chunk partitioning for deduplication, and (3) VMBackup, in our environment, can provide 8M/s backup throughput and 9.5M/s recovery throughput, which are only 15% and 4% less than those of storage systems without deduplication. Copyright © 2015 John Wiley & Sons, Ltd.
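
To make the ideas of variable-length chunk partitioning and a one-level file index concrete, the following is a minimal Python sketch under stated assumptions; it is not the VMBackup implementation. The gear-style rolling hash, the constants `MASK`, `MIN_CHUNK` and `MAX_CHUNK`, and the names `split_chunks` and `DedupStore` are hypothetical choices made for illustration. Similarity Retrieval and the Adjacent Storage model are not modelled here; a plain in-memory dictionary stands in for the chunk store.

```python
import hashlib
import random

# Hypothetical parameters, chosen only for illustration.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # per-byte random values
MASK = (1 << 6) - 1             # cut when the low 6 hash bits are zero
MIN_CHUNK, MAX_CHUNK = 16, 256  # bounds on chunk size, in bytes


def split_chunks(data: bytes):
    """Variable-length chunking: cut where a rolling hash matches a pattern."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        # Gear-style rolling hash: older bytes are shifted out of the low bits.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


class DedupStore:
    """In-memory stand-in for a chunk store plus a one-level file index."""

    def __init__(self):
        self.chunks = {}      # fingerprint -> chunk bytes (stored once)
        self.file_index = {}  # file id -> ordered list of fingerprints

    def backup(self, file_id, data: bytes):
        fingerprints = []
        for chunk in split_chunks(data):
            fp = hashlib.sha1(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)  # skip chunks already stored
            fingerprints.append(fp)
        self.file_index[file_id] = fingerprints

    def restore(self, file_id) -> bytes:
        # Reassemble the file by following the index in order.
        return b"".join(self.chunks[fp] for fp in self.file_index[file_id])

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())
```

As a usage example, backing up an image and a second image that differs only by a few bytes inserted at the front should store most chunks once, because the chunk boundaries follow the content rather than fixed offsets; with fixed-length chunks the same insertion would shift every later boundary.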
