Abstract

We consider the skyline problem (also known as the maxima problem), which has been extensively studied in the database community. The input is a set P of d-dimensional points. A point dominates another if the coordinate of the former is at most that of the latter on every dimension. The goal is to find the skyline, which is the set of points p ∈ P such that p is not dominated by any other point in P. The main result of this article is that, for any fixed dimensionality d ≥ 3, the skyline problem can be settled in external memory by performing $O((N/B)\log_{M/B}^{d-2}(N/B))$ I/Os in the worst case, where N is the cardinality of P, B is the size of a disk block, and M is the capacity of main memory. Similar bounds can also be achieved for computing several skyline variants, including the k-dominant skyline, the k-skyband, and the α-skyline. Furthermore, the performance can be improved if some dimensions of the data space have small domains. When the dimensionality d is not fixed, the challenge is to outperform the naive algorithm that simply checks all pairs of points in P × P. We give an algorithm that terminates in $O((N/B)\log^{d-2} N)$ I/Os, thus beating the naive solution for any d = O(log N / log log N).
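To make the definitions concrete, the following is a minimal in-memory sketch of the naive baseline mentioned above, which checks all pairs of points. It is not the external-memory algorithm of the article, and the function names `dominates` and `skyline` are illustrative choices. One assumption goes beyond the abstract: identical points are treated as not dominating each other, so duplicates do not eliminate one another.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q.

    p dominates q when p's coordinate is at most q's on every dimension.
    Assumption (not stated in the abstract): identical points do not
    dominate each other.
    """
    if tuple(p) == tuple(q):
        return False
    return all(a <= b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Naive skyline: keep each point that no other point dominates.

    Compares all pairs, so it performs a quadratic number of dominance
    tests; it only illustrates the problem statement.
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    sample = [(1, 4, 3), (2, 2, 2), (3, 3, 3), (4, 1, 5), (2, 5, 1)]
    # (3, 3, 3) is dominated by (2, 2, 2); the other four points survive.
    print(skyline(sample))
```

In this sample, (3, 3, 3) is removed because (2, 2, 2) is at most as large on every dimension, while each remaining point is the smallest on at least one dimension among the points that could dominate it.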
