Dynamic maintenance of multidimensional range data partitioning for parallel data processing

Junping Sun, William I Grosky

doi:10.1145/294260.294275

Abstract

The star schema has been a typical model for both online transaction processing (OLTP) in traditional databases and online analytical processing (OLAP) in large data warehouses. In a star schema, the bulk of the data is stored in the relationship table (in database terms) or the fact table (in data warehouse terms). This table is sometimes called a multidimensional table, cube, or data set. In this paper, we present a parallel method for partitioning the fact table in multidimensional space for parallel star query processing. We also give a dynamic approach, expressed as a set of heuristics, for maintaining load balance among the processors when the fact table undergoes frequent updates such as insertions and deletions. The multidimensionally partitioned data sets of the fact table are stored as leaf nodes in a multidimensional range tree, and the data set in each leaf node is mapped to a processor for parallel data partitioning and star query processing. To balance the load across processors, the heuristics try to keep the distribution of data volumes as uniform as possible for star query processing in OLAP.
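To make the idea of range partitioning with load balancing concrete, the following is a minimal illustrative sketch, not the method or heuristics of the paper. It assumes a fact-table row can be reduced to a tuple of integer dimension keys, splits rows by median cuts on alternating dimensions (a k-d-tree-style stand-in for the multidimensional range tree), and treats each resulting leaf as the data set of one processor. The names `Row`, `partition`, `imbalance`, `rebalance_if_needed`, and the `tolerance` parameter are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical layout: one fact-table row reduced to its dimension keys.
Row = Tuple[int, ...]


def partition(rows: List[Row], leaves: int, depth: int = 0) -> List[List[Row]]:
    """Split rows into exactly `leaves` partitions using median cuts.

    The split dimension cycles with the recursion depth. Rows that tie
    on the cut value may land on either side of the cut.
    """
    if leaves <= 1:
        return [rows]
    if not rows:
        return [[] for _ in range(leaves)]
    dim = depth % len(rows[0])
    ordered = sorted(rows, key=lambda r: r[dim])
    left_leaves = leaves // 2
    # Give the left subtree a share of rows proportional to its leaf count.
    cut = len(ordered) * left_leaves // leaves
    return (partition(ordered[:cut], left_leaves, depth + 1)
            + partition(ordered[cut:], leaves - left_leaves, depth + 1))


def imbalance(parts: List[List[Row]]) -> int:
    """Difference between the largest and smallest partition sizes."""
    sizes = [len(p) for p in parts]
    return max(sizes) - min(sizes)


def rebalance_if_needed(parts: List[List[Row]], tolerance: int) -> List[List[Row]]:
    """Assumed heuristic: rebuild all partitions once the skew exceeds `tolerance`."""
    if imbalance(parts) <= tolerance:
        return parts
    all_rows = [row for part in parts for row in part]
    return partition(all_rows, len(parts))


# Usage: 16 rows over two dimensions, spread across 4 processors.
rows = [(i, (i * 7) % 16) for i in range(16)]
parts = partition(rows, 4)

# Simulate a burst of insertions that all fall into the first partition.
parts[0].extend((100 + i, i) for i in range(8))
balanced = rebalance_if_needed(parts, tolerance=2)
```

A full rebuild is the simplest possible policy and is used here only for brevity; a dynamic scheme would more plausibly move data between neighboring leaves or split and merge individual leaves to avoid redistributing the whole table.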

Full Text