Abstract

Top-k joins have been extensively studied when numerical-valued attributes are joined on an equality predicate. Other types of join attributes and predicates have received little to no attention. In this paper, we consider spatial objects that are assigned a score (e.g., a ranking). Given two collections R, S of such objects and a spatial distance threshold 𝜖, we introduce the top-k spatial distance join (k-SDJoin), which identifies the k pairs of objects that have the highest combined score (based on an aggregate function γ) among all object pairs in R × S with a spatial distance of at most 𝜖. State-of-the-art methods for relational top-k joins can be adapted for k-SDJoin, but their focus is on minimizing the number of objects accessed from the inputs; when spatial objects are joined, however, the computational cost can easily become the bottleneck. In view of this, we propose a novel evaluation algorithm that greatly reduces the computational cost without compromising the access cost. The main idea is to access and efficiently join blocks of objects from each collection, using appropriate bounds to avoid computing the entire spatial 𝜖-distance join. As the performance of our solution relies heavily on the size of the input blocks, we devise an approach for automated block size tuning, enhanced by a novel generic model for estimating the number of objects to be accessed from each input. Contrary to previous efforts, our model employs cheap-to-compute statistics and requires no prior knowledge of the data distribution. Our extensive experimental analysis demonstrates the efficiency of our algorithm compared to methods based on existing literature that prioritize either the ranking or the spatial join component of k-SDJoin queries.
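To make the query definition concrete, the following is a minimal brute-force sketch of what a k-SDJoin returns. It is not the block-based algorithm proposed in the paper: it computes the full 𝜖-distance join, which is exactly the cost the paper aims to avoid. The function name `naive_k_sdjoin`, the `(x, y, score)` point representation, the Euclidean distance, and the choice of summation as the default γ are all assumptions made for illustration.

```python
import heapq
import math
from typing import Callable, List, Tuple

# Hypothetical object representation: (x, y, score).
Obj = Tuple[float, float, float]


def naive_k_sdjoin(
    R: List[Obj],
    S: List[Obj],
    k: int,
    eps: float,
    gamma: Callable[[float, float], float] = lambda a, b: a + b,
) -> List[Tuple[float, Obj, Obj]]:
    """Return the k pairs (r, s) with the highest gamma(score_r, score_s)
    among all pairs in R x S whose Euclidean distance is at most eps.

    Brute force: every pair in R x S is examined, so the cost is O(|R|*|S|).
    """
    qualifying = (
        (gamma(r[2], s[2]), r, s)
        for r in R
        for s in S
        if math.dist(r[:2], s[:2]) <= eps  # spatial distance predicate
    )
    # Keep only the k highest aggregate scores, in descending order.
    return heapq.nlargest(k, qualifying, key=lambda t: t[0])


if __name__ == "__main__":
    R = [(0.0, 0.0, 9.0), (5.0, 5.0, 8.0)]
    S = [(0.0, 1.0, 7.0), (5.0, 5.5, 6.0), (100.0, 100.0, 10.0)]
    for score, r, s in naive_k_sdjoin(R, S, k=2, eps=2.0):
        print(score, r, s)
```

In this toy input the highest-scored object of S (score 10.0) is not within 𝜖 of any object in R, so it appears in no result pair; this illustrates why ranking by score alone is not sufficient for this query.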
