Inverting middleware framework

Michela Taufer, Thomas M. Stricker, Roger Weber

doi:10.3929/ethz-a-006654456

Abstract

Clusters of commodity PCs are an attractive platform for parallel databases running large OLAP (On-Line Analytical Processing) workloads. Using a large number of cluster nodes could yield a significant speed-up when processing a very large data set. Still, engineering such systems for scalability and good performance remains a highly difficult task, since we still lack a deep understanding of the architectural issues of resource usage and scalability of OLAP applications on cost-effective clusters of PCs. A considerable amount of work must be invested in better performance analysis tools that work well with parallel and distributed platforms and with standard DBMSs. While standard DBMSs help the application writer reduce programming effort, they often cause a loss of control over performance issues, resulting in suboptimal usage of machine resources. To address this problem, we present a novel performance monitoring framework called the inverted middleware framework (MW 1). Our approach emphasizes reverse mapping the resource abstractions introduced by the middleware layer (e.g. the DBMS) in terms of resource usage cost. The inverted middleware framework assists the process of performance engineering by mapping low-level performance information, monitored at the operating system layer, back to a higher layer (i.e. the application layer), filtering performance counter samples at the operating system level and delivering good overall performance pictures at a higher level of abstraction. The framework is used side by side with the DBMS and delivers many interesting insights about the most critical resource for each of the different queries and system configurations.
As required for a larger distributed hardware/software system, our framework comprises some software instrumentation at the OS level, tools for gathering all performance-relevant data, and an analytical model that can be used for performance evaluation and for performance prediction on newer platforms. In this report, we demonstrate the viability of our approach with an in-depth analysis of TPC-D, a standard OLAP benchmark, running on clusters of commodity PCs. With the help of data provided by our performance monitoring framework, we are able to isolate and resolve a few crucial performance issues for OLAP workloads. As experts in the architecture of clusters, we intentionally limited our experimental work to two fixed distribution schemes (relations must be distributed to permit scalability to large data sets), leaving the issues of optimal data distribution and optimal use of indices to our database experts. As a result, we can give a good characterization of different OLAP workloads (i.e. the 17 queries of TPC-D) in terms of their resource usage, quantify the optimal scalability for different queries, and investigate the impact of the networking speed on the overall application performance. We can show that disk performance and CPU speed remain the most critical resource bottlenecks for a majority of the queries. Queries with a lot of inter-node communication are limited by the inefficiency of the communication software within the DBMS rather than by the raw networking speeds. We think that our work constitutes a solid basis for future architectural decisions and system optimization in clusters of PCs that are dedicated to large parallel database systems.
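
The reverse mapping described in the abstract, from operating-system counter samples back to individual queries, can be illustrated with a minimal Python sketch. It is not the framework's actual implementation: the names `Sample`, `QueryRun`, `attribute_samples` and `critical_resource` are hypothetical, the utilization values are assumed to be fractions between 0 and 1, and samples are assumed to be attributable to a query purely by timestamp.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One OS-level counter sample (hypothetical layout)."""
    t: float      # timestamp in seconds
    node: str     # cluster node that produced the sample
    cpu: float    # CPU utilization, 0..1
    disk: float   # disk utilization, 0..1
    net: float    # network utilization, 0..1


@dataclass(frozen=True)
class QueryRun:
    """Time interval during which one query was executing."""
    name: str
    start: float
    end: float


def attribute_samples(samples, runs):
    """Map each sample to the query running at its timestamp and
    return the mean utilization per resource for every query."""
    totals = defaultdict(lambda: {"cpu": 0.0, "disk": 0.0, "net": 0.0, "n": 0})
    for s in samples:
        for r in runs:
            if r.start <= s.t < r.end:
                acc = totals[r.name]
                acc["cpu"] += s.cpu
                acc["disk"] += s.disk
                acc["net"] += s.net
                acc["n"] += 1
                break  # assumption: query intervals do not overlap
    return {
        name: {k: acc[k] / acc["n"] for k in ("cpu", "disk", "net")}
        for name, acc in totals.items()
    }


def critical_resource(usage):
    """Return the resource with the highest mean utilization."""
    return max(usage, key=usage.get)
```

Given per-query profiles of this kind, the resource with the highest mean utilization is a first candidate for the bottleneck of that query; the analytical model mentioned in the abstract would go well beyond this simple ranking.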

Full Text