Fast set intersection in memory

Bolin Ding, Arnd Christian König

doi:10.14778/1938545.1938550

Open Access

https://doi.org/10.14778/1938545.1938550

Journal: Proceedings of the VLDB Endowment
Publication Date: Jan 1, 2011
Citations: 100

Affiliation: University of Illinois Urbana-Champaign, Microsoft (United States)

Abstract

Set intersection is a fundamental operation in information retrieval and database systems. This paper introduces linear-space data structures for representing sets such that their intersection can be computed efficiently in the worst case. In general, given k (preprocessed) sets with n elements in total, we show how to compute their intersection in expected time $O(n/\sqrt{w} + kr)$, where r is the intersection size and w is the number of bits in a machine word. In addition, we introduce a very simple version of this algorithm that has weaker asymptotic guarantees but performs even better in practice; both algorithms outperform state-of-the-art techniques on both synthetic and real data sets and workloads.
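The abstract does not spell out the data structure, so the following is only a minimal sketch of the general idea of word-level filtering, not the authors' algorithm. It assumes each set is preprocessed into small groups, each carrying a w-bit signature, so that a single bitwise AND can rule out a pair of groups before any element comparison. The names `preprocess`, `intersect`, `_group`, `_bit`, the constant `W = 64`, and both hash functions are hypothetical choices made for illustration.

```python
W = 64  # assumed machine-word width in bits (the "w" of the abstract)


def _group(x, num_groups):
    # Hypothetical grouping hash: equal elements always land in the same group.
    return x % num_groups


def _bit(x):
    # Hypothetical signature hash: maps an element to one of W bit positions.
    return ((x * 2654435761) >> 16) % W


def preprocess(elements, num_groups):
    """Build {group id: (members, W-bit signature)} for one set of integers."""
    groups = {}
    for x in set(elements):
        g = _group(x, num_groups)
        members, sig = groups.get(g, ([], 0))
        members.append(x)
        groups[g] = (members, sig | (1 << _bit(x)))
    return groups


def intersect(pa, pb):
    """Intersect two sets preprocessed with the same num_groups."""
    result = []
    for g, (members_a, sig_a) in pa.items():
        if g not in pb:
            continue
        members_b, sig_b = pb[g]
        # One AND filters the whole pair of groups: a shared element sets the
        # same bit in both signatures, so a zero result means no overlap.
        common = sig_a & sig_b
        if common == 0:
            continue
        # A set bit may be a false positive, so surviving candidates are verified.
        candidates_b = {y for y in members_b if (common >> _bit(y)) & 1}
        result.extend(
            x for x in members_a
            if (common >> _bit(x)) & 1 and x in candidates_b
        )
    return sorted(result)
```

For example, `intersect(preprocess(a, 4), preprocess(b, 4))` returns the common elements of `a` and `b` in sorted order. The sketch makes no attempt to reproduce the running-time bound stated in the abstract; it only shows why groups with disjoint signatures can be skipped with a single word operation.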
