Distance-based Join Queries (DJQs) have multiple applications in spatial databases, Geographic Information Systems, and other areas. The K Closest Pairs Query (KCPQ) and the ε Distance Join Query (εDJQ) are well-known DJQs that have been widely studied and can be solved using plane-sweep techniques, which are efficient but require the entire datasets to be kept in main memory. In this work, we propose DJQ algorithms that operate on data represented with a k²-tree, a compact data structure for binary grids. Our algorithms solve KCPQ and εDJQ queries, as well as several window-constrained variants, taking advantage of the indexing capabilities of k²-trees to answer queries efficiently without decompressing the data. Our experimental evaluation with large datasets shows that the k²-tree algorithms are up to 5 times faster than plane-sweep algorithms in KCPQ, and 5–30 times faster in εDJQ. In the window-constrained variants, our algorithms are competitive in most scenarios and faster for large windows. Additionally, our algorithms are largely insensitive to the distribution of the data and yield much more predictable query times, showing up to 30 times smaller variance in query times than plane sweep, depending on the location of the query window.