Efficient Hybrid Inline and Out-of-Line Deduplication for Backup Storage

Yan-Kit Li, Min Xu, Chun-Ho Ng, Patrick P. C. Lee

doi:10.1145/2641572

Abstract

Backup storage systems often remove redundancy across backups via inline deduplication, which works by referring duplicate chunks of the latest backup to those of existing backups. However, inline deduplication degrades restore performance of the latest backup due to fragmentation, and complicates deletion of expired backups due to the sharing of data chunks. While out-of-line deduplication addresses these problems by forward-pointing existing duplicate chunks to those of the latest backup, it introduces additional I/Os for writing and removing duplicate chunks. We design and implement RevDedup, an efficient hybrid inline and out-of-line deduplication system for backup storage. It applies coarse-grained inline deduplication to remove duplicates of the latest backup, and then fine-grained out-of-line reverse deduplication to remove duplicates from older backups. Our reverse deduplication design limits the I/O overhead and prepares for efficient deletion of expired backups. Through extensive testbed experiments using synthetic and real-world datasets, we show that RevDedup can bring high performance to the backup, restore, and deletion operations, while maintaining high storage efficiency comparable to conventional inline deduplication.
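The two stages described in the abstract can be illustrated with a small in-memory sketch. This is a toy model under stated assumptions, not the RevDedup implementation: the class `ToyHybridStore`, its method names, the fixed-size splitting, and the tiny `SEGMENT_SIZE` and `CHUNK_SIZE` values are all hypothetical and chosen only to keep the example readable. It shows coarse-grained inline deduplication at segment level when a backup is written, followed by a separate fine-grained pass that replaces duplicate chunks in the older backup with pointers into the newer one.

```python
import hashlib

SEGMENT_SIZE = 8  # coarse-grained unit; toy value chosen for illustration
CHUNK_SIZE = 4    # fine-grained unit; toy value chosen for illustration


def fingerprint(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def split(data: bytes, size: int) -> list:
    return [data[i:i + size] for i in range(0, len(data), size)]


class ToyHybridStore:
    """Hypothetical in-memory model; not the RevDedup implementation."""

    def __init__(self):
        # segment fingerprint -> list of slots; a slot is either chunk bytes
        # or a (segment fingerprint, slot index) pointer to another chunk
        self.segments = {}
        self.backups = []  # one recipe (list of segment fingerprints) per backup

    def backup(self, data: bytes) -> int:
        """Inline stage: write only segments not already in the store."""
        recipe = []
        for seg in split(data, SEGMENT_SIZE):
            fp = fingerprint(seg)
            if fp not in self.segments:
                self.segments[fp] = split(seg, CHUNK_SIZE)
            recipe.append(fp)
        self.backups.append(recipe)
        return len(self.backups) - 1

    def reverse_dedup(self, old: int, new: int) -> None:
        """Out-of-line stage: point old duplicate chunks at the newer backup."""
        new_segs = set(self.backups[new])
        index = {}  # chunk fingerprint -> location of a data slot in the newer backup
        for seg_fp in new_segs:
            for i, slot in enumerate(self.segments[seg_fp]):
                if isinstance(slot, bytes):
                    index.setdefault(fingerprint(slot), (seg_fp, i))
        # Only segments that the newer backup does not share are rewritten.
        for seg_fp in set(self.backups[old]) - new_segs:
            slots = self.segments[seg_fp]
            for i, slot in enumerate(slots):
                if isinstance(slot, bytes) and fingerprint(slot) in index:
                    slots[i] = index[fingerprint(slot)]  # drop the old copy

    def _read(self, seg_fp: str, i: int) -> bytes:
        slot = self.segments[seg_fp][i]
        while not isinstance(slot, bytes):  # follow forward pointers
            slot = self.segments[slot[0]][slot[1]]
        return slot

    def restore(self, version: int) -> bytes:
        return b"".join(
            self._read(fp, i)
            for fp in self.backups[version]
            for i in range(len(self.segments[fp]))
        )

    def stored_bytes(self) -> int:
        return sum(len(s) for slots in self.segments.values()
                   for s in slots if isinstance(s, bytes))
```

In this sketch, segments written for the newest backup hold chunk data directly, so restoring that backup follows no pointers, while the older backup pays the indirection cost. The sketch omits on-disk layout, reference counting, and deletion of expired backups, which the abstract mentions but does not detail.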

Journal: ACM Transactions on Storage
Publication Date: Dec 29, 2014
Citations: 39

Similar Papers

InDe: An Inline Data Deduplication Approach via Adaptive Detection of Valid Container Utilization
Lifang Lin ... Yi Zhou
ACM Transactions on Storage | VOL. 19
11 Jan 2023

ERP: An Efficient Rewrite Scheme to Improve the Inline Deduplication Restore Performance in Backup Systems
Weidong Liu ... Chentao Wu
01 Jan 2023

QuickCDC: A Quick Content Defined Chunking Algorithm Based on Jumping and Dynamically Adjusting Mask Bits
Zhen Xu ... Wenbo Zhang
01 Sep 2021

Space-efficient and high-performance inline deduplication for emerging hybrid storage system with Libra＋
Renhui Chen ... Jiwu Shu
Journal of Systems Architecture | VOL. 150
08 Apr 2024
