Efficient execution plans for distributed skyline query processing

João B. Rocha-Junior,Kjetil Nørvåg,Christos Doulkeridis,Akrivi Vlachou

doi:10.1145/1951365.1951399

Abstract

In this paper, we study the generation of efficient execution plans for skyline query processing in large-scale distributed environments. In such a setting, each server stores autonomously a fraction of the data, thus all servers need to process the skyline query. An execution plan defines the order in which the individual skyline queries are processed on different servers, and influences the performance of query processing. Querying servers consecutively reduces the amount of transferred data and the number of queried servers, since skyline points obtained by one server prune points in the subsequent servers, but also increases the latency of the system. To address this trade-off, we introduce a novel framework, called SkyPlan, for processing distributed skyline queries that generates execution plans aiming at optimizing the performance of query processing. Thus, we quantify the gain of querying consecutively different servers. Then, execution plans are generated that maximize the overall gain, while also taking into account additional objectives, such as bounding the maximum number of hops required for the query or balancing the load on different servers fairly. Finally, we present an algorithm for distributed processing based on the generated plan that continuously refines the execution plan during in-network processing. Our framework consistently outperforms the state-of-the-art algorithm.

Full Text