Abstract

The skyline query has recently attracted considerable research interest in several fields. The query is computed using the domination test, where a data point dominates another if it is not worse in any dimension and is better in at least one dimension. The skyline query can therefore be used to construct efficient queries over data from a variety of fields. However, as the number of dimensions or the amount of data increases, naïve skyline queries degrade overall performance owing to the higher cost of comparisons among data points. Several methods using index structures have been proposed to solve this problem, but they have not improved the performance of skyline queries because their indices are heavily influenced by the dimensionality and the amount of data. Therefore, in this study, we propose HI-Sky, a method that performs quick skyline computations by using a hash index to overcome these shortcomings. HI-Sky manages data through the hash index and significantly improves performance by eliminating unnecessary data comparisons when computing the skyline. We provide the theoretical background for HI-Sky and verify its improvement in skyline query performance through comparisons with prevalent methods.
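The domination test described above can be illustrated with a minimal sketch. It assumes that smaller values are better in every dimension (the abstract does not fix a direction), and the names `dominates` and `naive_skyline` are hypothetical, chosen only for illustration. This is the naïve quadratic baseline whose comparison cost motivates index-based methods, not HI-Sky itself.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q.

    Assumption: smaller is better. p must be no worse than q in every
    dimension and strictly better in at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Keep every point that no other point dominates (O(n^2) tests)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    data = [(1, 9), (2, 4), (3, 5), (5, 1), (6, 6)]
    print(naive_skyline(data))
```

Every point is compared against every other point, so the number of dominance tests grows quadratically with the data size, and each test grows linearly with the number of dimensions.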

Highlights

  • The skyline query [1] returns data points that are not dominated by other data points in a given database

  • Data point B may not dominate data points C and D, because these points do not satisfy the conditions of Lemma 1: neither point lies in a partition whose corresponding dimensions are smaller than those of the other point's partition, given the order of the partitions. Lemma 1 shows that hash index-based skyline (HI-Sky) can prune the data space using GLAD and remove a large amount of data early, which significantly reduces the number of dominance tests

  • We proposed a hash index structure and a special hash key for a skyline query


Summary

Introduction

The skyline query [1] returns data points that are not dominated by other data points in a given database. Research interest has tended toward index structure-based methods due to their efficiency in handling large amounts of data. BBS and Z-SKY are not suitable for skyline computation when the data frequently changes, as these methods require a large number of resources to maintain the indices [14,15]. We propose a hash index-based skyline (HI-Sky), which is a skyline query method. The results of our experiment demonstrate that HI-Sky generates indices faster than other methods and performs faster skyline query processing at higher dimensions by effectively reducing the number of dominance tests.
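To make the idea of reducing dominance tests through a hash index more concrete, here is a hedged sketch of partition-level pruning. It is not the HI-Sky algorithm: the summary does not specify HI-Sky's hash key or its pruning rule, so the uniform grid, the `cell_width` parameter, and all function names below are assumptions made only for illustration. Smaller values are again assumed to be better. The sketch hashes points into grid cells and discards a whole cell when some non-empty cell is strictly smaller in every dimension, since every point in the smaller cell then dominates every point in the larger one.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
Cell = Tuple[int, ...]


def dominates(p: Point, q: Point) -> bool:
    """Smaller is better: p is no worse everywhere and better somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def cell_key(p: Point, cell_width: float) -> Cell:
    """Hypothetical hash key: the grid cell that contains the point."""
    return tuple(int(x // cell_width) for x in p)


def build_index(points: List[Point], cell_width: float) -> Dict[Cell, List[Point]]:
    """Group points by grid cell using a hash map."""
    index: Dict[Cell, List[Point]] = defaultdict(list)
    for p in points:
        index[cell_key(p, cell_width)].append(p)
    return index


def cell_strictly_below(a: Cell, b: Cell) -> bool:
    """True if cell a has a smaller coordinate than cell b in every dimension."""
    return all(x < y for x, y in zip(a, b))


def grid_skyline(points: List[Point], cell_width: float) -> List[Point]:
    index = build_index(points, cell_width)
    cells = list(index)
    # Prune whole cells first: no point-level dominance tests are needed here.
    surviving = [b for b in cells
                 if not any(cell_strictly_below(a, b) for a in cells)]
    candidates = [p for c in surviving for p in index[c]]
    # Point-level dominance tests run only on the remaining candidates.
    return [p for p in candidates
            if not any(dominates(q, p) for q in candidates)]


if __name__ == "__main__":
    data = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (5.0, 1.0), (6.0, 6.0)]
    print(grid_skyline(data, cell_width=2.0))
```

In this sketch the cell containing (6.0, 6.0) is discarded without any point-level comparison, because the cell holding (2.0, 4.0) and (3.0, 5.0) is strictly smaller in both dimensions. Inserting or deleting a point only touches one hash bucket, which is the kind of cheap index maintenance the text contrasts with BBS and Z-SKY.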

Related Study
Traditional Skyline Computation
Index Based Skyline Computation
Parallel and Distributed Skyline Computation
Hash Index-Based Skyline Query Processing
Hash Index for Skyline
Background
Given d-dimensional space Sorder
Data Space Pruning Step
Skyline Computation Step
Skyline Computation using HI-Sky
Performance Evaluation
Experimental Environment
Comparison of Changes in np
Comparison of Indexing Time
Comparison of Skyline Computation Time
Comparisons Using Real-World Dataset
Conclusions