Abstract Redundancy elimination, or deduplication, over network packets requires significant computing resources to find the basic units of repeated content, called chunks, by checking every byte in every packet. In this paper, we present the first constant-time chunking algorithm, which divides every packet into a predefined number of chunks irrespective of the packet size. In addition, we present a best implementation practice for packet-level deduplication by selecting an optimal combination of chunking, fingerprinting, and hash table algorithms. Through experiments with real traffic, we confirm that throughput improves threefold even over the state-of-the-art scheme.
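To make the idea of fixed-count, constant-time chunking concrete, the following is a minimal sketch, not the paper's algorithm: it splits a packet into a predefined number of chunks whose boundaries are computed directly from the packet length (no per-byte boundary scan), and fingerprints each chunk. The chunk count, function names, and the FNV-1a fingerprint are illustrative assumptions, not choices taken from the paper.

```c
/*
 * Illustrative sketch of fixed-count chunking: every packet is split
 * into NUM_CHUNKS pieces regardless of its size, so choosing chunk
 * boundaries costs O(1) per packet. NUM_CHUNKS and the FNV-1a
 * fingerprint are assumed for illustration only.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NUM_CHUNKS 8  /* predefined chunk count (assumed parameter) */

/* FNV-1a hash used here as a stand-in fingerprint function. */
static uint64_t fnv1a(const uint8_t *data, size_t len)
{
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Split one packet into NUM_CHUNKS pieces and fingerprint each one. */
static void chunk_packet(const uint8_t *pkt, size_t pkt_len,
                         uint64_t fp[NUM_CHUNKS])
{
    size_t base = pkt_len / NUM_CHUNKS;
    size_t rem  = pkt_len % NUM_CHUNKS;
    size_t off  = 0;

    for (int i = 0; i < NUM_CHUNKS; i++) {
        /* Spread the remainder bytes over the first `rem` chunks. */
        size_t len = base + ((size_t)i < rem ? 1 : 0);
        fp[i] = fnv1a(pkt + off, len);
        off += len;
    }
}

int main(void)
{
    const char *payload = "example packet payload for deduplication";
    uint64_t fp[NUM_CHUNKS];

    chunk_packet((const uint8_t *)payload, strlen(payload), fp);
    for (int i = 0; i < NUM_CHUNKS; i++)
        printf("chunk %d fingerprint: %016llx\n", i,
               (unsigned long long)fp[i]);
    return 0;
}
```

In a deduplication pipeline, each fingerprint would then be looked up in a hash table of previously seen chunks so that repeated content can be replaced with a reference; that lookup stage is outside the scope of this sketch.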