Abstract

Deletions in open-addressing hash tables are often handled by marking the cells as “deleted” instead of “empty”, because otherwise the search algorithm might fail to find some of the keys. The space used by deleted cells may be reused by subsequent insertions. Intuitively, search times should deteriorate as tables become contaminated with deleted cells and, as Knuth points out, in the long run the average successful search time should approach the average unsuccessful search time. We analyze the effect of a long sequence of deletions and insertions on tables that use one of three insertion disciplines: standard “first-come-first-served” (FCFS) [2], “last-come-first-served” (LCFS) [3], and “Robin Hood” (RH) [1]. We show that deletions have the predicted effect on the average search cost, but their effect on the variance depends on the insertion discipline used. When no deletions are allowed, FCFS has a very dispersed search-cost distribution, while those of LCFS and RH are very concentrated. We show, however, that after many deletions and insertions both FCFS and LCFS approach a common steady state with high dispersion, while the distribution of RH remains concentrated. We also study the transient behavior of these methods, through both an asymptotic and an exact analysis.
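To make the setting concrete, the following is a minimal sketch (not taken from the paper; all names are illustrative) of an open-addressing table with linear probing and FCFS insertion, where deletion marks cells as “deleted” (a tombstone) rather than “empty”. The search loop must skip tombstones instead of stopping at them, and insertions may reuse tombstone slots, as described in the abstract.

```python
# Illustrative sketch: open addressing with linear probing, FCFS insertion,
# and tombstone ("deleted") markers for deletions. Names are hypothetical.
EMPTY, DELETED = object(), object()

class FCFSTable:
    def __init__(self, size=8):
        self.size = size
        self.slots = [EMPTY] * size

    def _probe(self, key):
        # Linear probing: start at hash(key) mod size, advance by 1.
        i = hash(key) % self.size
        for _ in range(self.size):
            yield i
            i = (i + 1) % self.size

    def insert(self, key):
        first_deleted = None  # first tombstone seen, reusable for this key
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                # End of probe sequence: place key, reusing a tombstone if any.
                target = first_deleted if first_deleted is not None else i
                self.slots[target] = key
                return
            if slot is DELETED:
                if first_deleted is None:
                    first_deleted = i
            elif slot == key:
                return  # key already present
        if first_deleted is not None:
            self.slots[first_deleted] = key
        else:
            raise RuntimeError("table full")

    def search(self, key):
        # Tombstones must NOT terminate the search: keys inserted past
        # them would otherwise become unreachable.
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                return False
            if slot is not DELETED and slot == key:
                return True
        return False

    def delete(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                return
            if slot is not DELETED and slot == key:
                self.slots[i] = DELETED  # mark, don't empty
                return
```

The LCFS and RH disciplines studied in the paper differ only in how `insert` resolves collisions (LCFS displaces the resident key; Robin Hood gives the slot to whichever key is farther from its home position); the tombstone mechanism is the same.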
