An efficient parallel processing method for skyline queries in MapReduce

Junsu Kim, Myoung Ho Kim

doi:10.1007/s11227-017-2171-y

Abstract

Skyline queries are useful for finding only the interesting tuples in multi-dimensional datasets for multi-criteria decision making. To improve the performance of skyline query processing on large-scale data, it is necessary to use parallel and distributed frameworks such as MapReduce, which has been widely used recently. Several approaches process skyline queries on a MapReduce framework to improve query performance. Some process part of the skyline computation serially, while others process all parts of the skyline computation in parallel. However, each of them suffers from at least one of two drawbacks: (1) the serial computations may prevent them from fully utilizing the parallelism of the MapReduce framework; (2) when skyline queries are processed in a parallel and distributed manner, the additional overhead of parallel processing may outweigh the benefit gained from parallelization. To process skyline queries over large data efficiently in parallel, we propose a novel two-phase approach in the MapReduce framework. In the first phase, we divide the input dataset into a number of subsets (called cells) and then compute local skylines only for the qualified cells. The outer-cell filter used in this phase considerably improves performance by eliminating a large number of tuples in unqualified cells. In the second phase, the global skyline is computed from the local skylines. To determine the global skyline tuples from each local skyline separately and in parallel, we design the inner-cell filter and also propose efficient methods to reduce the overhead caused by computing and utilizing the inner-cell filters. The primary advantage of our approach is that, with the two filtering techniques, it processes skyline queries quickly and in a fully parallelized manner in all stages of the MapReduce framework. Through extensive experiments, we demonstrate that the proposed approach substantially increases the overall performance of skyline queries in comparison with state-of-the-art skyline processing methods. In particular, the proposed method achieves remarkably good performance and scalability with regard to dataset size and dimensionality. Our approach has significant benefits for large-scale skyline query processing in distributed and parallel computing environments.
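
The two-phase idea described in the abstract can be illustrated with a small single-machine sketch. This is not the authors' algorithm: the abstract does not specify how the outer-cell and inner-cell filters are built, so the sketch assumes a uniform grid of cells, uses a simple cell-dominance test as a stand-in for the outer-cell filter, and compares each local skyline only against local skylines of cells that could contain dominating tuples as a stand-in for the inner-cell filter. All function names and the `width` parameter are hypothetical, and smaller values are assumed to be better in every dimension.

```python
from collections import defaultdict


def dominates(p, q):
    """p dominates q: no worse in every dimension, strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Brute-force skyline: keep the tuples that no other tuple dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def cell_of(p, width):
    """Grid cell index of a tuple (assumed uniform grid with cells of side `width`)."""
    return tuple(int(x // width) for x in p)


def cell_dominates(c, d):
    """If c is strictly smaller than d in every dimension, every tuple in
    cell c dominates every tuple in cell d."""
    return all(a < b for a, b in zip(c, d))


def two_phase_skyline(points, width):
    # Partition the input into cells (the role of the map step).
    cells = defaultdict(list)
    for p in points:
        cells[cell_of(p, width)].append(p)

    # Phase 1: discard cells dominated by another non-empty cell
    # (assumed stand-in for the outer-cell filter), then compute
    # a local skyline for each remaining cell independently.
    qualified = {
        c: pts for c, pts in cells.items()
        if not any(cell_dominates(o, c) for o in cells)
    }
    local = {c: skyline(pts) for c, pts in qualified.items()}

    # Phase 2: each cell checks its local skyline only against tuples from
    # cells whose index is no larger in every dimension (assumed stand-in
    # for the inner-cell filter). Each iteration is independent of the others.
    result = []
    for c, pts in local.items():
        rivals = [
            q
            for o, qs in local.items()
            if o != c and all(a <= b for a, b in zip(o, c))
            for q in qs
        ]
        result.extend(p for p in pts if not any(dominates(q, p) for q in rivals))
    return result
```

In this sketch, every per-cell step in both phases depends only on that cell's tuples and a small set of candidate rivals, which is the property that would let the work be spread across mappers and reducers. The actual filter construction and overhead-reduction methods are described in the paper itself.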

Journal: The Journal of Supercomputing	Publication Date: Oct 31, 2017
Citations: 8

Similar Papers

Efficient execution plans for distributed skyline query processing
João B Rocha-Junior ... Akrivi Vlachou
21 Mar 2011

Application of processing technology based on skyline query in computer network
Yifu Zeng ... Chuang Li
Neural Computing and Applications | VOL. 34
01 Apr 2021

Efficient Processing of Metric Skyline Queries
Lei Chen ... Xiang Lian
IEEE Transactions on Knowledge and Data Engineering | VOL. 21
01 Mar 2009

Efficient progressive processing of skyline queries in peer-to-peer systems
Huajing Li ... Wang-Chien Lee
01 Jan 2006
