Abstract

of thesis entitled
Scalable Skyline Evaluation in Multidimensional and Partially Ordered Domains
Submitted by Shiming ZHANG
for the degree of Doctor of Philosophy in Computer Science
at The University of Hong Kong in August 2011

The skyline query, as an elegant and sophisticated paradigm for flexible multicriteria data analysis, has attracted a lot of attention in advanced database applications. Specifically, given a d-dimensional database D and a set of multidimensional preferences P, which involve partial or total orders of attributes, the skyline of D w.r.t. P is the superior subset of D containing the points that are not dominated by any other point in D. Here, an object o dominates another object o′ if and only if o is better than or as good as o′ in all dimensions and better than o′ in at least one dimension. The skyline points present a scale-free choice of multidimensional data points worthy of further consideration in many contexts.

In high-dimensional spaces, skyline computation easily becomes CPU-intensive due to the large number of dominance tests. We focus on this problem and propose a dynamic indexing technique that organizes skyline points in a tree, which is integrated into state-of-the-art sort-based skyline algorithms to boost their computational performance by orders of magnitude. The novel indexing and dominance checking approach is supported by a theoretical analysis and scales well with data dimensionality and cardinality, owing not only to tremendous savings in unnecessary dominance tests but also to efficient dominance checking with the help of bitwise operations.

Skyline evaluation over partially ordered attributes is seldom considered in the literature. A few prior methods with a partial-to-total mapping scheme adopt stronger notions of dominance, which generate false positives or require expensive dominance checks. We focus on this problem and propose two novel methods (i.e., CPS and SCL) which do not have these drawbacks.
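
The dominance definition given earlier in this abstract can be illustrated with a minimal sketch. This is a naive quadratic baseline for totally ordered attributes only, not the index-based or partial-order methods proposed in the thesis. It assumes that smaller values are better in every dimension; the names `dominates` and `naive_skyline` and the sample data are hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(o: Point, other: Point) -> bool:
    """Return True if o dominates other.

    Assumption: smaller is better in every dimension. o must be at least
    as good everywhere and strictly better in at least one dimension.
    """
    no_worse = all(a <= b for a, b in zip(o, other))
    strictly_better = any(a < b for a, b in zip(o, other))
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Keep every point that no other point dominates (O(n^2) dominance tests)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance) pairs, lower is better in both.
hotels = [(50, 8), (60, 3), (70, 5), (90, 1), (55, 9)]
skyline = naive_skyline(hotels)
```

Here `(70, 5)` is dominated by `(60, 3)` and `(55, 9)` by `(50, 8)`, so only the remaining three points survive. Each candidate is compared against every other point, which is the source of the large number of dominance tests the abstract refers to.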
Our first method uses an appropriate mapping that embeds a partial order into chain products and then applies an off-the-shelf skyline algorithm. The second technique uses a column-wise storage and indexing approach, which facilitates efficient incomparability verification. The empirical
