Abstract

Bloom filters (BFs) are space-efficient randomized data structures used for fast set membership queries in many applications. The standard BF tolerates a small fraction of false positives in exchange for high space efficiency, but it does not support deletion of items. The state-of-the-art cuckoo filter (CF) extends the standard BF to support deletions while achieving higher performance and lower space cost. However, the CF suffers from a critical issue: its space cost per item varies. This is because the exclusive-OR (XOR) operation used by the CF requires the total number of buckets to be a power of two, which inflates the space used. To address this issue, we propose in this paper a scalable variant of the CF called the tagged cuckoo filter (TCF). The TCF uses a tagged fingerprint of an item, instead of the untagged fingerprint used by the CF, to compute the item's two candidate bucket indexes. With a tag attached to the fingerprint, the TCF does not require the number of buckets to be a power of two, which results in a low space cost per item. Experimental results show that the TCF improves space efficiency while sustaining high performance. Compared to the CF, the TCF reduces the space cost per item by up to 2x while achieving comparable lookup and update performance. In addition, the TCF outperforms other filters in both space cost and performance at the same false positive rates.
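The power-of-two constraint mentioned above can be illustrated with a minimal sketch of the standard CF's XOR-based alternate-bucket computation. This is not the TCF scheme, whose tagging details are not given in the abstract. The function names and the multiplicative mixing constant are assumptions chosen for illustration only. The sketch checks whether applying the alternate-bucket mapping twice returns the original bucket, which a cuckoo filter relies on when relocating a stored fingerprint, and shows how rounding up to a power of two can nearly double the bucket count.

```python
def alt_index(index: int, fingerprint: int, num_buckets: int) -> int:
    """Alternate bucket in the style of the standard CF: index XOR hash(fingerprint).

    The mixing constant below is an arbitrary illustrative choice, not taken
    from the paper.
    """
    fp_hash = (fingerprint * 0x5BD1E995) & 0xFFFFFFFF
    return (index ^ fp_hash) % num_buckets


def is_involution(num_buckets: int, fingerprints=range(1, 256)) -> bool:
    """True if mapping twice always returns the starting bucket."""
    return all(
        alt_index(alt_index(i, fp, num_buckets), fp, num_buckets) == i
        for i in range(num_buckets)
        for fp in fingerprints
    )


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


if __name__ == "__main__":
    print(is_involution(16))        # power-of-two bucket count
    print(is_involution(12))        # non-power-of-two bucket count
    print(next_power_of_two(1025))  # buckets allocated when 1025 are needed
```

With 16 buckets the mapping is its own inverse, so a relocated fingerprint can always find its way back. With 12 buckets it is not, which is why the XOR scheme forces the bucket count up to the next power of two, for example from 1025 needed buckets to 2048 allocated ones.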
