Abstract

Once exotic, computational accelerators are now commonly available in many computing systems. Graphics processing units (GPUs) are perhaps the most frequently encountered computational accelerators. Recent work has shown that GPUs are beneficial when analyzing massive data sets. Specifically related to this study, it has been demonstrated that GPUs can significantly reduce the query processing time of database bitmap index queries. Bitmap indices are typically used for large, read-only data sets and are often compressed using some form of hybrid run-length compression. In this paper, we present three GPU algorithm enhancement strategies for executing queries over bitmap indices compressed using Word-Aligned Hybrid (WAH) compression: (1) data structure reuse, (2) metadata creation with various type alignments, and (3) a preallocated memory pool. Data structure reuse greatly reduces the number of costly memory system calls. The use of metadata exploits the immutable nature of bitmaps to pre-calculate and store necessary intermediate processing results; this metadata reduces the number of required query-time processing steps. Preallocating a memory pool can reduce or entirely remove the overhead of memory operations during query processing. Our empirical study showed that combining these strategies can achieve a 32.4× to 98.7× speedup over the current state-of-the-art implementation. Our study also showed that, using our enhancements, a common gaming GPU can achieve a 15.0× speedup over a more expensive high-end CPU.
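To make strategies (1) and (3) concrete, the sketch below shows one plausible shape for a preallocated device memory pool that is created once and reused across queries, removing cudaMalloc/cudaFree calls from the query path. This is a minimal illustration under our own assumptions; the type and method names (MemoryPool, init, alloc, reset) are hypothetical and not taken from the paper's implementation.

```cuda
// Hypothetical sketch of a preallocated device memory pool (strategies 1 and 3).
// One large allocation is made up front; queries carve sub-buffers out of it
// with a bump pointer, so no cudaMalloc/cudaFree occurs during query processing.
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdint>

struct MemoryPool {
    uint64_t *base     = nullptr;  // single large device allocation
    size_t    capacity = 0;        // pool capacity, in 64-bit words
    size_t    offset   = 0;        // bump-pointer cursor

    // Allocate the pool once, before any queries run.
    void init(size_t words) {
        cudaMalloc(reinterpret_cast<void **>(&base), words * sizeof(uint64_t));
        capacity = words;
        offset   = 0;
    }

    // Hand out a sub-buffer; no device allocation call on the query path.
    uint64_t *alloc(size_t words) {
        if (offset + words > capacity) return nullptr;  // pool exhausted
        uint64_t *p = base + offset;
        offset += words;
        return p;
    }

    // Reuse the whole pool for the next query by resetting the cursor.
    void reset() { offset = 0; }

    void destroy() { cudaFree(base); base = nullptr; }
};
```

In this sketch, a query would obtain its intermediate buffers with alloc(), and the driver would call reset() between queries, so the cost of device allocation is paid only once at startup.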

Highlights

  • Modern companies rely on big data to drive their business decisions [14, 16, 31]

  • We explore techniques that use metadata, data structure reuse, and memory preallocation, tailored to speed up the processing of word-aligned hybrid (WAH) range queries on graphics processing units (GPUs); a sketch of WAH word decoding appears after this list

  • We present an empirical study of our proposed enhancements to the GPU-WAH decompression algorithm applied to both real and synthetic data sets
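As context for the WAH-focused highlights above, the sketch below decodes a single 64-bit WAH word following the standard WAH scheme: the most-significant bit distinguishes a literal word (63 bitmap bits) from a fill word, whose next bit gives the fill value and whose low 62 bits give the run length in 63-bit words. This is a hedged illustration of the conventional encoding, not code from the paper; the function name and signature are our own.

```cuda
// Illustrative decoder for one 64-bit WAH word (standard WAH layout assumed).
#include <cstdint>

__host__ __device__ inline void decode_wah_word(
    uint64_t word,
    bool     *is_fill,   // out: true for a fill (run) word, false for a literal
    uint64_t *payload,   // out: 63 literal bits, or the fill pattern
    uint64_t *run_len)   // out: run length in 63-bit words (1 for a literal)
{
    if (word >> 63) {                                 // MSB set: fill word
        *is_fill = true;
        // Second-most-significant bit selects an all-zeros or all-ones fill.
        *payload = ((word >> 62) & 1) ? 0x7FFFFFFFFFFFFFFFULL : 0ULL;
        *run_len = word & 0x3FFFFFFFFFFFFFFFULL;      // low 62 bits
    } else {                                          // MSB clear: literal word
        *is_fill = false;
        *payload = word;                              // 63 bitmap bits
        *run_len = 1;
    }
}
```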



Introduction

Modern companies rely on big data to drive their business decisions [14, 16, 31]. A prime example of this corporate reliance on data is Starbucks, which uses big data to determine where to open stores, to target customer recommendations, and to plan menu updates [30]. The coffee company even uses weather data to adjust its digital advertisement copy [6]. To meet this need, companies are collecting astounding amounts of data; the shipping company UPS, for example, stores over 16 petabytes of data to meet its business needs [14]. Large repositories of data are only useful if they can be analyzed in a timely and efficient manner. We present techniques that take advantage of synergies between hardware and software to speed up the analysis of data.
