Weaknesses of Cuckoo Hashing with a Simple Universal Hash Class: The Case of Large Universes

Martin Dietzfelbinger, Ulf Schellbach

doi:10.1007/978-3-540-95891-8_22

Abstract

Cuckoo hashing was introduced by Pagh and Rodler in 2001 [12]. A set S of n keys is stored in two tables T₁ and T₂, each of which has m cells of capacity 1, such that constant access time is guaranteed. For m ≥ (1 + ε)n and hash functions h₁, h₂ that are c log n-wise independent, Pagh [11] showed that the keys of an arbitrary set S can be stored using h₁ and h₂ with a probability of 1 − O(1/n). Here we prove that a family of simple hash functions that can be evaluated fast is not sufficient to guarantee this behavior: there exists a “bad” set S of size ≅ (7/8)·m for which the probability that the keys of S cannot be stored using h₁ and h₂ is Ω(1). Experiments indicate that the bad sets cause the cuckoo scheme to fail with a probability much larger than formally proved in our main theorem. Our result shows that care must be taken when using cuckoo hashing in combination with very simple hash classes if a small failure probability is essential because frequent rehashing cannot be tolerated.
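To make the notion of "the keys of S cannot be stored using h 1 and h 2" concrete, the following is a minimal sketch, not the authors' code. It relies on the standard cuckoo-graph criterion: each cell is a vertex, each key x is an edge between T1[h1(x)] and T2[h2(x)], and a placement with one key per cell exists exactly when no connected component has more edges than vertices. The function names (make_linear_hash, can_store, failure_rate) are hypothetical, and the linear class ((a·x + b) mod p) mod m is only an assumed stand-in for "a simple universal hash class"; the abstract does not specify which class the paper analyzes.

```python
import random


def make_linear_hash(p, m, rng):
    """Draw h(x) = ((a*x + b) mod p) mod m from a textbook linear class.

    Assumption for illustration only: the abstract does not name the
    hash class studied in the paper.
    """
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda x: ((a * x + b) % p) % m


def can_store(keys, h1, h2, m):
    """Return True iff all keys fit into T1/T2 (m cells each, capacity 1).

    Cells 0..m-1 stand for T1, cells m..2m-1 for T2. A union-find
    structure tracks, per connected component of the cuckoo graph,
    the number of vertices (size) and edges (keys).
    """
    parent = list(range(2 * m))
    size = [1] * (2 * m)
    edges = [0] * (2 * m)

    def find(v):
        # Path halving keeps the trees shallow.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for x in keys:
        u, v = find(h1(x)), find(m + h2(x))
        if u != v:
            # Merge the two components and combine their counters.
            parent[v] = u
            size[u] += size[v]
            edges[u] += edges[v]
        edges[u] += 1
        if edges[u] > size[u]:
            return False  # more keys than cells in this component
    return True


def failure_rate(keys, p, m, trials, seed=0):
    """Estimate the fraction of random (h1, h2) pairs that cannot store keys."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        h1 = make_linear_hash(p, m, rng)
        h2 = make_linear_hash(p, m, rng)
        if not can_store(keys, h1, h2, m):
            failures += 1
    return failures / trials
```

Calling failure_rate on a candidate key set with a prime p larger than every key gives an empirical failure frequency for that set under the assumed class. It says nothing about the specific bad sets constructed in the paper, which are not described in the abstract.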
