Abstract

With the continuous development of information technology, data has become one of the key factors determining the survival and growth of enterprises. However, as the volume of data grows, the storage and backup of massive data place great pressure on enterprise data centers. Statistics show that growing data sets contain large amounts of duplicate data, which motivated the development of deduplication technology. As a relatively new data compression technique, deduplication still faces many problems and challenges. This paper studies the duplicate detection rate and the metadata size of deduplication systems, proposes two solutions based on a B+ tree index and a hash index respectively, and quantitatively evaluates and analyzes their respective advantages and disadvantages.
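The hash-index approach mentioned above can be illustrated with a minimal sketch: each incoming chunk is fingerprinted, and a hash table maps fingerprints to already-stored chunks so that duplicates are stored only once. This is a simplified illustration, not the paper's actual system; the function name `deduplicate`, the use of SHA-256, and the in-memory dictionary index are all assumptions for the example.

```python
import hashlib

def deduplicate(chunks):
    """Store each unique chunk once and return a recipe of references.

    A hypothetical in-memory hash index maps a chunk's SHA-256 digest
    to its position in the unique-chunk store.
    """
    index = {}   # digest -> position in store (the hash index)
    store = []   # unique chunks, each stored exactly once
    recipe = []  # per-chunk references used to rebuild the stream
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in index:        # unseen chunk: store it
            index[digest] = len(store)
            store.append(chunk)
        recipe.append(index[digest])   # duplicates become references
    return store, recipe

# Three chunks, one duplicated: only two are physically stored.
store, recipe = deduplicate([b"aaaa", b"bbbb", b"aaaa"])
```

A B+ tree index would serve the same lookup role but keeps fingerprints sorted on disk, trading constant-time lookups for better locality on very large metadata sets.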
