Abstract

The GPU's powerful computational capacity holds great potential for processing hierarchically compressed data without decompression in the data science domain. Unfortunately, existing GPU approaches offer only traversal-based data analytics; random access is extremely inefficient, which substantially limits their utility. To solve this problem, we develop a novel and broadly applicable optimization that enables efficient random access to hierarchically compressed data, without decompression, in GPU memory. Doing so raises three major challenges. The first is designing GPU data structures that support random access. The second is generating those data structures efficiently on the GPU: generating them on the CPU is costly, and the inefficiency grows dramatically once PCIe data transfer is included. The third is query processing on compressed data in GPU memory: random accesses, including data updates, cause significant conflicts among massive numbers of threads. To address the first challenge, we propose and adapt several compressed data structures, including indexing within the complex GPU memory hierarchy. To address the second, we develop a two-phase process for generating these data structures on the GPU. To address the third, we propose a double-parsing design that avoids data conflicts. We evaluate our solution on two GPU platforms using five real-world datasets. Experiments show that random access operations on the GPU achieve a 65.04x average speedup over the state-of-the-art method.
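To give a flavor of the core idea, the following is a minimal illustrative sketch (not the paper's implementation, and on the CPU rather than the GPU) of random access into hierarchically compressed data without decompression. It assumes a grammar-compressed representation (a straight-line program) where each nonterminal expands into two symbols; precomputed expansion lengths let `access(i)` walk a single root-to-leaf path instead of decompressing the whole string. The grammar `rules` and the function names are hypothetical.

```python
from functools import lru_cache

# Hypothetical grammar compressing the string "abab":
#   R2 -> R1 R1,  R1 -> 'a' 'b'
rules = {"R1": ("a", "b"), "R2": ("R1", "R1")}

@lru_cache(maxsize=None)
def expanded_len(sym):
    """Length of the fully expanded string for a symbol."""
    if sym not in rules:  # terminal character
        return 1
    left, right = rules[sym]
    return expanded_len(left) + expanded_len(right)

def access(root, i):
    """Return the i-th character of root's expansion without decompressing.

    Runs in time proportional to the grammar's depth: at each rule we
    compare i against the left child's expansion length and descend.
    """
    sym = root
    while sym in rules:
        left, right = rules[sym]
        n = expanded_len(left)
        if i < n:
            sym = left
        else:
            sym, i = right, i - n
    return sym
```

For example, `access("R2", 2)` returns `'a'`, the third character of `"abab"`, while touching only two rules. The paper's contribution lies in making this kind of index-guided access efficient under GPU constraints (memory hierarchy, on-GPU structure generation, and conflicts among massive thread counts), which this sketch does not model.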

