Abstract

One of the most powerful and prominent technologies for knowledge discovery in decision support systems is online analytical processing (OLAP). Most traditional OLAP research, and most commercial systems, follow the static data cube approach proposed by Gray et al. and materialize all or a subset of the cuboids of the data cube in order to ensure adequate query performance. Practitioners have long called for a real-time OLAP approach in which the OLAP system is updated instantaneously as new data arrives and always provides an up-to-date data warehouse for the decision support process. However, a major problem for real-time OLAP is its significant performance cost on large-scale data warehouses. The aim of our research is to address this problem through the use of efficient parallel computing methods. In this paper, we present a parallel real-time OLAP system for multi-core processors. To our knowledge, this is the first real-time OLAP system that has been parallelized and optimized for contemporary multi-core architectures. Our system allows multiple insert and multiple query transactions to be executed in parallel and in real time. We evaluated our method for a multitude of scenarios (different ratios of insert and query transactions, query transactions with different amounts of data aggregation, different database sizes, etc.), using the TPC-DS “Decision Support” benchmark data set. As multi-core test platforms, we used an Intel Sandy Bridge processor with 4 cores (8 hardware-supported threads) and an Intel Xeon Westmere processor with 20 cores (40 hardware-supported threads). The tests demonstrate that, with an increasing number of processor cores, our parallel system achieves close to linear speedup in transaction response time and transaction throughput. On the 20-core architecture we achieved, for a 100 GB database, a query response time better than 0.25 seconds for real-time OLAP queries that aggregate 25% of the database.
Since hardware performance improvements are currently, and for the foreseeable future will be, achieved not by faster processors but by increasing the number of processor cores, our new parallel real-time OLAP method has the potential to enable OLAP systems that operate in real time on large databases.

