Birdwatching

Novak Boskov, Ari Trachtenberg, David Starobinski

doi:10.1145/3426746.3434063

Abstract

Cuckoo filters are a probabilistic data structure for approximate membership queries. Their lookup queries are designed to return either "probably in the set" (with error probability ε) or "definitely not in the set". We show that the latter guarantee does not necessarily hold in practice, meaning that these filters may suffer from both false positives and false negatives. Specifically, we analyze state-of-the-art cuckoo filter implementations and identify a source of false negatives arising from an interplay between partial-key hashing and cuckoo evictions in filters that are close to full. We further show that, for practical implementations of cuckoo filters, there is a trade-off between space efficiency and incurring a certain amount of false negatives. Finally, we compare state-of-the-art cuckoo filter implementations with their Bloom filter counterparts. We show that for a false positive rate below 3%, Bloom filters achieve better space efficiency than cuckoo filters for most filter sizes.
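
To make the terms above concrete, the following is a minimal Python sketch of a cuckoo filter with partial-key hashing. It is an illustration under stated assumptions, not any of the implementations analyzed in the paper: the class name `TinyCuckooFilter`, the SHA-256-based hash helpers, the tiny parameters, and the policy of silently dropping the last evicted fingerprint once the kick limit is reached are all hypothetical choices. The sketch shows one way a failed insertion into a nearly full filter can leave an earlier item unfindable.

```python
import hashlib


def _h(data: bytes) -> int:
    """64-bit hash derived from SHA-256 (an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class TinyCuckooFilter:
    """Hypothetical toy filter: stores only fingerprints, two candidate buckets each."""

    def __init__(self, num_buckets=8, bucket_size=2, fp_bits=8, max_kicks=16):
        # A power-of-two bucket count makes the XOR-based alternate index
        # an involution: alt_index(alt_index(i, fp), fp) == i.
        assert num_buckets & (num_buckets - 1) == 0
        self.m = num_buckets
        self.b = bucket_size
        self.fp_bits = fp_bits
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]

    def fingerprint(self, item: str) -> int:
        # Non-zero fingerprint of fp_bits bits.
        return _h(b"fp:" + item.encode()) % ((1 << self.fp_bits) - 1) + 1

    def index1(self, item: str) -> int:
        return _h(b"ix:" + item.encode()) % self.m

    def alt_index(self, i: int, fp: int) -> int:
        # Partial-key hashing: the other bucket depends only on the current
        # bucket and the fingerprint, never on the original item.
        return (i ^ _h(b"alt:" + fp.to_bytes(4, "big"))) % self.m

    def insert(self, item: str) -> bool:
        fp = self.fingerprint(item)
        i1 = self.index1(item)
        i2 = self.alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.b:
                self.buckets[i].append(fp)
                return True
        # Both candidate buckets are full: evict and relocate fingerprints.
        i = i1
        for kick in range(self.max_kicks):
            slot = kick % self.b  # deterministic victim choice for reproducibility
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self.alt_index(i, fp)
            if len(self.buckets[i]) < self.b:
                self.buckets[i].append(fp)
                return True
        # Assumption of this sketch: the last victim held in `fp` is dropped.
        # If it belonged to an earlier item, that item may now be a false negative.
        return False

    def contains(self, item: str) -> bool:
        fp = self.fingerprint(item)
        i1 = self.index1(item)
        return fp in self.buckets[i1] or fp in self.buckets[self.alt_index(i1, fp)]


# Fill the filter until the first insertion fails, then look for lost items.
cf = TinyCuckooFilter()
stored = []
n = 0
while True:
    item = f"item-{n}"
    n += 1
    if not cf.insert(item):
        break
    stored.append(item)

missing = [x for x in stored if not cf.contains(x)]
print(f"{len(stored)} items stored before the first failed insert")
print(f"previously stored items no longer found: {missing}")
```

In this sketch, lookups have no false negatives as long as every insertion succeeds, because each relocated fingerprint always lands in its other candidate bucket. Whether `missing` is non-empty after the failed insertion depends on which fingerprint happens to be dropped, so the script reports it rather than assuming a particular outcome.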

Full Text