Abstract

Real-time data warehousing has been studied in research and tried in practice by several companies, but the architecture still lacks a clear definition and an evaluation of its merits. Drawing on both industry and theoretical insights, we define a data warehouse architecture for constant integration of data without compromising query performance, and we evaluate its capacity to deliver real-time operation. For this real-time data warehouse (RTDW), we define a dynamic warehouse component, which holds the recently integrated data, and a static warehouse component, which holds the rest of the data, together with the relevant choices concerning how the two components are merged. We propose design choices for the query computation mechanisms and evaluate the alternatives to determine which implementation of those mechanisms is the most efficient. The architecture is evaluated with a real-time benchmark setup that combines online loading with simultaneous query workload processing. The results validate the architecture, compare the different mechanisms, and quantify its efficiency in the constant integration context. The performance is analysed taking into account several important factors.
