Abstract

A range query applies an aggregation operation (e.g., SUM) over all selected cells of an OLAP data cube, where the selection is specified by providing ranges of values for numeric dimensions. Range sum queries on data cubes are a powerful analysis tool. Many application domains require that data cubes be updated often and that the information provided by analysis tools be current or “near current”. Existing techniques for range sum queries on data cubes, however, can incur update costs on the order of the size of the data cube. Since the size of a data cube is exponential in the number of its dimensions, rebuilding the entire data cube can be very costly and is not realistic. To cope with this dynamic data cube problem, a new approach has been introduced recently, which achieves constant time per range sum query while bounding each update cost by $O(n^{d/2})$, where $d$ is the number of dimensions of the data cube and $n$ is the number of distinct values of the domain at each dimension. In this paper, we provide a new algorithm for the problem which requires $O(n^{1/3})$ time for each range sum query and $O(n^{d/3})$ time for each update. Compared with the $O(n^{d/2})$ update time of the current approach, our algorithm improves the update time by a factor of $O(n^{d/6})$. Like all existing techniques, our approach to answering range sum queries is also based on precomputed auxiliary information (prefix sums) that is used to answer ad hoc queries at run time. Under both the product model and a new model introduced in this paper, the total cost for updates and range queries of the proposed algorithm is the smallest among all known algorithms. Therefore, our algorithm significantly reduces the overall time complexity for range sum queries.
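The trade-off described above can be illustrated with the basic prefix-sum technique on a two-dimensional cube. The sketch below is not the algorithm proposed in this paper. It is a minimal illustration, assuming a dense cube stored as a list of lists, of why precomputed prefix sums give constant-time range sum queries but can force an update to touch a number of entries on the order of the cube size. The function names `build_prefix_sums`, `range_sum`, and `update_cell` are hypothetical and chosen only for this example.

```python
def build_prefix_sums(cube):
    """Return a padded table P where P[i][j] is the sum of cube[0..i-1][0..j-1]."""
    rows, cols = len(cube), len(cube[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            prefix[i + 1][j + 1] = (
                cube[i][j] + prefix[i][j + 1] + prefix[i + 1][j] - prefix[i][j]
            )
    return prefix


def range_sum(prefix, lo_row, hi_row, lo_col, hi_col):
    """Sum over the inclusive ranges using four lookups (inclusion-exclusion)."""
    return (
        prefix[hi_row + 1][hi_col + 1]
        - prefix[lo_row][hi_col + 1]
        - prefix[hi_row + 1][lo_col]
        + prefix[lo_row][lo_col]
    )


def update_cell(cube, prefix, row, col, new_value):
    """Set one cell and repair every prefix sum that covers it.

    Returns the number of prefix entries touched, which in the worst case
    (row = col = 0) equals the number of cells in the cube.
    """
    delta = new_value - cube[row][col]
    cube[row][col] = new_value
    touched = 0
    for i in range(row + 1, len(prefix)):
        for j in range(col + 1, len(prefix[0])):
            prefix[i][j] += delta
            touched += 1
    return touched


if __name__ == "__main__":
    cube = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    prefix = build_prefix_sums(cube)
    print(range_sum(prefix, 1, 2, 1, 2))       # 5 + 6 + 8 + 9 = 28
    print(update_cell(cube, prefix, 0, 0, 10))  # 9 entries touched
    print(range_sum(prefix, 0, 2, 0, 2))       # 54 after the update
```

In this sketch, a query always costs four table lookups, while updating the corner cell rewrites all nine prefix entries. Approaches to the dynamic data cube problem, including the one in this paper, give up some query speed to bound that update cost.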
