Abstract
Skyline queries are widely used as an effective query tool in many contemporary database applications. They retrieve the non-dominated tuples in a database, which are known as skylines. In most database applications, the contents are dynamic because the database is continuously changed, typically through data manipulation operations (INSERT and/or UPDATE). These operations invalidate the skylines that were derived before the changes were made. Furthermore, incomplete data has become a frequent phenomenon in recent database applications. Data incompleteness poses several challenges for skyline queries: the dominance relation loses its transitivity property, and dominance tests between tuples can become cyclic. Reapplying the skyline technique to the entire updated incomplete database to determine the new skylines is unwise because of the exhaustive pairwise comparisons it requires. This paper therefore proposes an approach, named Incomplete Dynamic Skyline Algorithm (IDSA), that determines the skylines of dynamic and incomplete databases. IDSA incorporates two optimization techniques: pruning and selecting superior local skylines. Pruning exploits the skylines derived before the INSERT/UPDATE operation to identify the new skylines, and selecting superior local skylines eliminates the remaining non-skylines from further processing. Together, these two techniques greatly reduce the number of domination tests, because the skylines do not have to be recomputed over the entire updated database.
Extensive experiments were conducted on both real and synthetic datasets, and the results demonstrate that IDSA outperforms the existing solutions in terms of the number of domination tests and the processing time of the skyline operation.
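The loss of transitivity and the cyclic dominance mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common convention for incomplete data, namely that two tuples are compared only on the dimensions where both have values; the paper's exact definition may differ. The `None` encoding for missing values, the "larger is better" preference, and the three tuples are illustrative assumptions, not data from the paper.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[float]]  # None marks a missing value (assumed encoding)


def dominates(a: Row, b: Row) -> bool:
    """Return True if a dominates b on the dimensions where both have values.

    Assumption: larger values are better on every dimension.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return False  # no common dimension: the tuples are incomparable
    no_worse = all(x >= y for x, y in shared)
    strictly_better = any(x > y for x, y in shared)
    return no_worse and strictly_better


# Made-up tuples, each missing a different dimension.
p = (2, None, 1)
q = (1, 2, None)
r = (None, 1, 2)

print(dominates(p, q))  # True: compared on dimension 0 only
print(dominates(q, r))  # True: compared on dimension 1 only
print(dominates(p, r))  # False: transitivity does not hold
print(dominates(r, p))  # True: the dominance relation is cyclic
```

In this example every tuple is dominated by some other tuple, so a plain "keep what nothing dominates" rule returns an empty result, which is one reason incomplete data needs dedicated skyline algorithms.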
Highlights
Traditional queries are inflexible: they either return the data that strictly satisfies the conditions given in the submitted query or return no result at all.
Among the most notable variants of the skyline technique designed for databases with complete data are Divide-and-Conquer (D&C), Block Nested-Loop (BNL) [14], Bitmap and Index [15], Sort Filter Skyline (SFS) [16], Branch and Bound Skyline (BBS) [19], Linear Elimination Sort Skyline (LESS) [17], Sort and Limit Skyline algorithm (SaLSa) [18], Nearest Neighbor (NN) [20], ZSearch [21], and OSPS [22]. The assumption of data completeness ensures that all tuples are comparable with each other, so the pairwise comparisons are straightforward and directly identify the skylines.
This paper proposes a new skyline solution, called Incomplete Dynamic Skyline Algorithm (IDSA), that retrieves the skylines over a dynamic and incomplete database whose state changes when insert operations are performed on the initial incomplete database.
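To make the idea of reusing the previous skylines after an insert concrete, here is a minimal sketch. It is not IDSA: the pruning and superior-local-skyline steps of the paper are not reproduced. It assumes that a tuple is a skyline exactly when no other tuple dominates it, that tuples are compared only on dimensions where both have values, and that larger values are better. The names `naive_skyline` and `insert_tuple` and all sample values are hypothetical.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[float], ...]  # None marks a missing value (assumed encoding)


def dominates(a: Row, b: Row) -> bool:
    """a dominates b on their shared dimensions (assumption: larger is better)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return False
    return all(x >= y for x, y in shared) and any(x > y for x, y in shared)


def naive_skyline(db: List[Row]) -> List[Row]:
    """Baseline: exhaustive pairwise comparison over the whole database."""
    return [t for t in db if not any(dominates(o, t) for o in db)]


def insert_tuple(db: List[Row], skyline: List[Row], new: Row) -> List[Row]:
    """Update the skyline after inserting `new`, reusing the previous skyline.

    Old non-skylines stay dominated because nothing is deleted, so only the
    previous skylines and the new tuple need to be re-examined.
    """
    survivors = [s for s in skyline if not dominates(new, s)]
    # Without transitivity, `new` must be checked against every stored tuple,
    # not just against the previous skylines.
    if not any(dominates(t, new) for t in db):
        survivors.append(new)
    db.append(new)
    return survivors


# Made-up data: three dimensions, some values missing.
db = [(4, None, 3), (None, 2, 5), (3, 1, None), (1, 1, 1)]
skyline = naive_skyline(db)
skyline = insert_tuple(db, skyline, (5, 3, None))
print(skyline)
```

Under these assumptions the insert costs one pass over the database plus one pass over the previous skylines, instead of the quadratic number of comparisons that a full recomputation needs.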
Summary
Traditional queries are inflexible: they either return the data that strictly satisfies the conditions given in the submitted query or return no result at all. Based on the most recent information in the bar database, bar b9, which was reported as a skyline before the insert operation, is now dominated by the newly inserted bar b13 on the rating dimension. This indicates that b9 is no longer a valid skyline and should be removed from the skyline result. It is unwise and impractical to reapply the skyline technique to the entire database after the changes in order to compute the new skylines, because not all tuples are affected by the insert operation. The paper highlights the problem of processing skyline queries in an incomplete and dynamic database, where the values of certain dimensions of tuples are missing and the contents of the database are frequently updated through data manipulation operations (insert and update). The conclusion is described in the final section, Section VI.