Abstract

Materialized views are a common means of speeding up queries in a data warehouse, aggregation queries in particular. The problems that come with materialized aggregate views have prompted a wide variety of proposals, addressing how to pick the optimal set of aggregation combinations, how to transparently rewrite user queries to take advantage of the summary data, and how to synchronize pre-computed summary data as soon as the base data changes. This paper focuses on the problem of view selection in distributed data warehouse architectures. While the view selection problem has been studied extensively for the centralized case, we are not aware of any other work that addresses it for distributed data warehouse systems. We extend the concept of an aggregation lattice to capture the distributed semantics. Moreover, we extend a greedy selection algorithm, based on an adequate cost model, to the distributed case. Finally, in a performance study, we compare our approach with that of applying a selection algorithm locally at each node of a distributed warehouse environment.
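
For orientation, the following is a minimal sketch of greedy view selection over an aggregation lattice in the centralized case, using the well-known benefit-per-view heuristic. It is not the distributed algorithm or cost model proposed in the paper, which the abstract does not specify. The function names, the dimension names `a`, `b`, `c`, and all row counts are hypothetical and chosen only for illustration. The sketch assumes a linear cost model in which answering a query costs as many rows as its smallest materialized ancestor contains.

```python
def query_cost(view, materialized, sizes):
    """Rows scanned to answer `view` from its smallest materialized ancestor."""
    # A view can be derived from any view that groups by a superset of its dimensions.
    return min(sizes[m] for m in materialized if view <= m)


def benefit(candidate, materialized, sizes):
    """Total rows saved across all lattice nodes if `candidate` is materialized."""
    total = 0
    for view in sizes:
        if view <= candidate:  # only descendants of the candidate can profit
            saved = query_cost(view, materialized, sizes) - sizes[candidate]
            total += max(0, saved)
    return total


def greedy_select(sizes, k):
    """Pick up to k views in addition to the top view, best benefit first."""
    top = max(sizes, key=len)  # the base view groups by every dimension
    materialized = [top]       # assumed to be always available
    for _ in range(k):
        candidates = [v for v in sizes if v not in materialized]
        if not candidates:
            break
        best = max(candidates, key=lambda v: benefit(v, materialized, sizes))
        if benefit(best, materialized, sizes) <= 0:
            break              # nothing left that saves any work
        materialized.append(best)
    return materialized


# Hypothetical lattice over dimensions a, b, c with made-up row counts.
sizes = {
    frozenset("abc"): 100,
    frozenset("ab"): 50,
    frozenset("ac"): 75,
    frozenset("bc"): 20,
    frozenset("a"): 30,
    frozenset("b"): 10,
    frozenset("c"): 15,
    frozenset(""): 1,
}

selected = greedy_select(sizes, k=2)
print([("".join(sorted(v)) or "()") for v in selected])
```

With these numbers, the first round selects the view on `b, c` (benefit 320) and the second selects the view on `a, b` (benefit 100). A distributed variant would additionally have to account for where each view is stored and what it costs to ship data between nodes; this sketch ignores both.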
