Abstract

Boolean Matrix Factorization (BMF) is a commonly used technique in the field of unsupervised data analytics. The goal is to decompose a ground truth matrix C into a product of two matrices A and $B$ being either an exact or approximate rank k factorization of C. Both exact and approximate factorization are time-consuming tasks due to their combinatorial complexity. In this paper, we introduce a massively parallel implementation of BMF - namely cuBool - in order to significantly speed up factorization of huge Boolean matrices. Our approach is based on alternately adjusting rows and columns of A and B using thousands of lightweight CUDA threads. The massively parallel manipulation of entries enables full usage of all available cores on modern CUDA-enabled GPUs. Additionally, modelling up to 32 consecutive entries of the Boolean matrices A, Band C as 32-bit integer results in fewer data accesses and faster computation of inner products. This bit-parallel approach allows for a significant decrease of memory requirements in contrast to gradient-based continuous updates of entries on dense representations. cuBool is compared to other state-of-the-art matrix factorization algorithms. Experiments on a number of real-world data sets show highly competitive results at only a fraction of computation time. cuBool proves to be a good compromise between low run time and a high-quality factorization for the decomposition of large-scale Boolean matrices. It can freely be accessed under https://github.com/funatiq/cubool.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.