Abstract
Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches, top-down vs. bottom-up. For efficient cube computation in various data distributions, this chapter proposes an interesting cube computation method, Star-Cubing, that integrates the strength of both top-down and bottom-up cube computation, and explores a few additional optimization techniques. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-by's that do not satisfy the iceberg condition. The performance study shows that Star-Cubing is highly efficient and outperforms all the previous methods in almost all kinds of data distributions. Two optimization techniques are emphasized: (1) shared aggregation by taking advantage of shared dimensions among the current cuboid and its descendant cuboids; and (2) prune as soon as possible the unpromising cells during the cube computation using the anti-monotonic property of the iceberg cube measure. No previous cubing method has fully explored both optimization methods in one algorithm. Moreover, a new compressed data structure, star-tree, is proposed using star nodes. In addition, a few other optimization techniques contribute to the high performance of the method.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.