Top-k dominating queries use an intuitive scoring function which ranks multidimensional points with respect to their dominance power, i.e., the number of points that a point dominates. The k points with the best (e.g., highest) scores are returned to the user. Both top-k and skyline queries have been studied in a streaming environment, where changes to the data set are very frequent. In such an environment, continuous query processing techniques are required toward efficient monitoring of query results, since periodic query re-execution is computationally intensive, and therefore, prohibitive. This work contains the first study of continuous top-k dominating queries over data streams. In comparison to continuous top-k and skyline queries, continuous top-k dominating queries pose additional challenges. Three exact algorithms (BFA, EVA, ADA) are studied, and among them ADA, which is enhanced with additional optimization techniques, shows the best overall performance. In some cases, we are willing to trade accuracy for speed. Toward this direction, two approximate algorithms are proposed (AHBA and AMSA). AHBA offers probabilistic guarantees regarding the accuracy of the result based on the Hoeffding bound, whereas AMSA performs a more aggressive computation resulting in more efficient processing. Evaluation results, based on real-life and synthetic data sets, show the efficiency and scalability of our techniques.
Read full abstract