Abstract

Big data is coming to the fore in many applications. Processing a skyline query on big data in more than linear time is far too expensive, and often even linear time may be too slow. An exact solution to a skyline query cannot be computed in sublinear time, since the exact solution may itself have linear size. Fortunately, in many situations a fast approximate solution is more useful than a slower exact one. This paper proposes two sampling-based approximate algorithms for processing skyline queries. The first algorithm draws a fixed-size sample and computes the approximate skyline on it. Its error is relatively small in most cases and is almost unaffected by the input size. The second algorithm efficiently returns an [Formula: see text]-approximation of the exact skyline. In practice, its running time is independent of the input size, which achieves the goal of sublinearity on big data. Experiments verify the error analysis of the first algorithm and show that the second is much faster than existing skyline algorithms.
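As a rough illustration of the first idea (draw a fixed-size sample, then compute the skyline on the sample only), a minimal Python sketch is given here. It is not the paper's algorithm: the function names `dominates`, `skyline`, and `sampled_skyline`, the convention that smaller values are better, uniform sampling without replacement, and the naive quadratic skyline routine are all assumptions made for this example.

```python
import random


def dominates(p, q):
    """Return True if p dominates q (assumed convention: smaller is better).

    p dominates q when p is no worse in every dimension and strictly
    better in at least one.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Naive quadratic skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def sampled_skyline(points, sample_size, seed=None):
    """Approximate skyline computed on a fixed-size uniform sample.

    Hypothetical helper: the cost of the skyline step depends on
    sample_size rather than on len(points).
    """
    rng = random.Random(seed)
    if sample_size >= len(points):
        sample = list(points)  # sample covers the whole input: exact result
    else:
        sample = rng.sample(points, sample_size)  # uniform, without replacement
    return skyline(sample)
```

Because the skyline is computed on the sample alone, a returned point may still be dominated by an unsampled point, which is the source of the approximation error the abstract refers to.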
