CuBool: Bit-Parallel Boolean Matrix Factorization on CUDA-Enabled Accelerators

Robin Kobus,Andre Muller,Bertil Schmidt,Adrian Lamoth,Stefan Kramer,Christian Hundt

doi:10.1109/padsw.2018.8644574

Abstract

Boolean Matrix Factorization (BMF) is a commonly used technique in the field of unsupervised data analytics. The goal is to decompose a ground truth matrix C into a product of two matrices A and $B$ being either an exact or approximate rank k factorization of C. Both exact and approximate factorization are time-consuming tasks due to their combinatorial complexity. In this paper, we introduce a massively parallel implementation of BMF - namely cuBool - in order to significantly speed up factorization of huge Boolean matrices. Our approach is based on alternately adjusting rows and columns of A and B using thousands of lightweight CUDA threads. The massively parallel manipulation of entries enables full usage of all available cores on modern CUDA-enabled GPUs. Additionally, modelling up to 32 consecutive entries of the Boolean matrices A, Band C as 32-bit integer results in fewer data accesses and faster computation of inner products. This bit-parallel approach allows for a significant decrease of memory requirements in contrast to gradient-based continuous updates of entries on dense representations. cuBool is compared to other state-of-the-art matrix factorization algorithms. Experiments on a number of real-world data sets show highly competitive results at only a fraction of computation time. cuBool proves to be a good compromise between low run time and a high-quality factorization for the decomposition of large-scale Boolean matrices. It can freely be accessed under https://github.com/funatiq/cubool.

Full Text