Abstract
Over the past two decades, the database community has seen growing research interest in preference queries, each type of which has its own techniques, benefits, and drawbacks. Skyline queries are one such type: they aim to report objects that are interesting to users according to their preferences. Yet they are not without limitations. This paper therefore focuses on efficiently extending skyline query processing to support uncertainty in dimensions, which this paper defines as an uncertain dimension. To process skyline queries on data with uncertain dimensions, we propose the SkyQUD algorithm, which partitions the dataset according to the characteristics of each object before skyline dominance tests are performed. In the pruning process, we use a probability threshold value τ to cope with the large skyline size that SkyQUD would otherwise report because of the computed probabilities. The algorithm has been validated through extensive experiments. The results show that skyline queries can be performed effectively on uncertain dimensions, and that the proposed algorithm is efficient in query answering and capable of handling large datasets.
Highlights
In a database system, conventional SQL queries are acknowledged for having strict constraints and reporting an exact match and complete result set
To keep the comparisons between groups of local skyline candidates simple, the mismatch probability dominance test treats the group Gm whose corresponding list of uncertain dimensions is ΘGm = {0} as the set of initial global skyline candidates, and compares that group to the remaining groups Gn, n ≠ m
We have defined the concept of uncertain dimensions and proposed an efficient algorithm, SkyQUD, to answer skyline queries on data with uncertain dimensions
Summary
In a database system, conventional SQL queries are acknowledged for having strict constraints and reporting an exact match and complete result set. Various studies have addressed skyline query processing based on the above uncertainty models. These studies assume that either (i) uncertainty in data is caused by multiple instances that represent an object [1], [6], [37], [51], or (ii) uncertainty is caused by representing the values in a dimension as continuous ranges [21]. We further enhance and explicate the three methods proposed in the SkyQUD algorithm to reduce the search space of skyline queries on data with uncertain dimensions.
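The general shape of the pipeline described above (partition the objects by their uncertain dimensions, run dominance tests, then prune with τ) can be illustrated with a minimal sketch. This is not the SkyQUD implementation. It assumes that an uncertain value is stored as a `(low, high)` tuple, that smaller values are preferred, and that skyline probabilities have already been computed by some other means. All function names are hypothetical.

```python
from collections import defaultdict

def uncertain_dims(obj):
    """Indices of dimensions stored as (low, high) ranges (assumed encoding)."""
    return frozenset(i for i, v in enumerate(obj) if isinstance(v, tuple))

def partition(objects):
    """Group objects by their set of uncertain dimensions."""
    groups = defaultdict(list)
    for obj in objects:
        groups[uncertain_dims(obj)].append(obj)
    return dict(groups)

def dominates(a, b):
    """Conventional dominance on exact values, assuming smaller is better."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def local_skyline(group):
    """Objects in a group of exact-valued objects that no other member dominates."""
    return [p for p in group
            if not any(dominates(q, p) for q in group if q is not p)]

def apply_threshold(candidates, tau):
    """Keep candidates whose (precomputed) skyline probability is at least tau."""
    return [obj for obj, prob in candidates if prob >= tau]

# Illustrative data only: two-dimensional objects, some with a range in one dimension.
objects = [(1, 5), (2, 2), (3, 6), (4, (1, 3)), ((2, 4), 1)]
groups = partition(objects)
certain_skyline = local_skyline(groups[frozenset()])
```

Here `groups[frozenset()]` holds the objects with no uncertain dimension, so the conventional dominance test applies to them directly. How objects from different groups are compared, and how their probabilities are derived, is specific to the paper and is deliberately left out of this sketch.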