Abstract

An effective optimization strategy for evaluating statistical queries is to use precomputed summary data on certain categories. An important step in this strategy is to compare categories for containment in order to decide whether the summary data on one category can be used to compute the summary on another. This paper studies optimization for such comparisons. A category in this paper is represented by a relation whose attributes are partitioned into pair-wise disjoint sets, each called a dimension. A category is said to be orthogonal if it is equal to the cross product of the projections of itself on all the dimensions, and k-partially orthogonal if it is the union of k orthogonal ones. Comparing k-partially orthogonal categories for containment is computationally much easier than comparing arbitrary categories if k is small and all the orthogonal subcategories are known. It is shown however that it is computationally intractable (NP-hard) to partition an arbitrarily given category into the smallest number of orthogonal subcategories. In order to avoid this intractable task but still take advantage of orthogonality, this paper investigates methods that derive orthogonality in categories which are results of relational queries, assuming the orthogonality in input categories is known. The methods are based on a careful examination of each relational operation and on certain auxiliary constructs for labelling orthogonal subcategories.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call