Abstract

The emergence of large-scale dynamic sets in networked and distributed applications imposes stringent requirements on approximate set representation. Existing data structures (including the Bloom filter, the Cuckoo filter, and their variants) maintain a tight dependency between the cells or buckets assigned to an element and the length of the filter. This dependency, however, degrades the capacity elasticity, space efficiency, and design flexibility of these data structures when representing dynamic sets. In this paper, we first propose the Index-Independent Cuckoo filter (I2CF), a probabilistic data structure that decouples the length of the filter from the indices of the buckets that store element information. At its core, an I2CF maintains a consistent hash ring to assign buckets to elements and generalizes the Cuckoo filter by providing optional $k$ candidate buckets for each element. By adding and removing buckets adaptively, I2CF supports bucket-level capacity alteration for dynamic set representation. Moreover, to handle a sudden increase or decrease in set cardinality, we further organize multiple I2CFs as a Consistent Cuckoo filter (CCF) to provide filter-level capacity elasticity. By adding untapped I2CFs or merging under-utilized I2CFs, CCF can resize its capacity instantly. Trace-driven experiments indicate that CCF outperforms its alternatives and simultaneously realizes our design rationales for dynamic set representation, at the cost of slightly higher complexity.
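The bucket assignment described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `BucketRing`, the helper `_h`, the SHA-256 based hashing, the 2^32 ring size, and the per-index salting used to derive the $k$ candidates are all assumptions made for illustration. The sketch only shows how candidate buckets can be looked up on a consistent hash ring independently of the number of buckets; fingerprint storage and cuckoo relocation are omitted.

```python
import bisect
import hashlib


def _h(data: str, space: int = 2**32) -> int:
    """Hash a string to a point on the ring [0, space). Hypothetical helper."""
    digest = hashlib.sha256(data.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % space


class BucketRing:
    """Illustrative ring of buckets; an assumption, not the paper's code."""

    def __init__(self, bucket_names):
        self._points = []  # sorted ring positions of all buckets
        self._owner = {}   # ring position -> bucket name
        for name in bucket_names:
            self.add_bucket(name)

    def add_bucket(self, name):
        """Place a new bucket on the ring (bucket-level capacity growth)."""
        point = _h("bucket:" + name)
        if point in self._owner:
            raise ValueError("ring position collision")
        bisect.insort(self._points, point)
        self._owner[point] = name

    def remove_bucket(self, name):
        """Take a bucket off the ring (bucket-level capacity shrink)."""
        point = _h("bucket:" + name)
        self._points.remove(point)
        del self._owner[point]

    def candidates(self, element, k):
        """Return k candidate buckets for an element (duplicates possible).

        Each candidate is the first bucket at or after the element's
        i-th hash point, wrapping around the end of the ring.
        Assumes the ring holds at least one bucket.
        """
        result = []
        for i in range(k):
            point = _h(f"elem:{i}:{element}")
            idx = bisect.bisect_left(self._points, point) % len(self._points)
            result.append(self._owner[self._points[idx]])
        return result
```

In this sketch, adding a bucket changes a candidate only when the corresponding hash point falls in the arc taken over by the new bucket, so every other candidate stays where it was. This is the property that lets capacity change without recomputing all bucket indices.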

Highlights

  • Set representation that supports membership queries is a fundamental problem in databases, caches, routers, storage, and distributed applications [1]

  • We compare the performance of the Consistent Cuckoo filter (CCF) with that of the Dynamic Cuckoo filter (DCF) for dynamic set representation and quantify the impact of the parameters

  • We present the CCF design for dynamic set representation and membership query, with the targets of capacity elasticity, space efficiency, and design flexibility


Summary

A Capacity-elastic Cuckoo Filter Design for Dynamic Set Representation

Existing data structures (including the Bloom filter, the Cuckoo filter, and their variants) maintain a tight dependency between the cells or buckets assigned to an element and the length of the filter. This dependency degrades the capacity elasticity, space efficiency, and design flexibility of these data structures when representing dynamic sets. By adding and removing buckets adaptively, the Index-Independent Cuckoo filter (I2CF) supports bucket-level capacity alteration for dynamic set representation. To handle a sudden increase or decrease in set cardinality, multiple I2CFs are further organized as a Consistent Cuckoo filter (CCF) to provide filter-level capacity elasticity. Trace-driven experiments indicate that CCF outperforms its alternatives and simultaneously realizes the design rationales for dynamic set representation, at the cost of slightly higher complexity.

INTRODUCTION
APPLICATIONS OF SKETCHES IN NETWORKS
Cuckoo Hash Table
Consistent Hashing
Cuckoo Filters
CONSISTENT CUCKOO FILTER
Dynamic Set Representation with CCF
Resizing operations of I2CF and CCF
Resizing Strategy in CCF
PERFORMANCE ANALYSIS OF CCF
Time-complexities of CCF
Threshold for CCF Insertion
Probability of a Successful Representation
EVALUATION
Comparison with DCF
Impact of Parameters in CCFB
Impact of Parameters in CCFF
Further Comparison between DCF and CCFF
CONCLUSION
