Abstract

Given high-order, large-scale, and sparse data from big-data and industrial applications, how can we extract useful patterns in real time and with low memory overhead? Sparse non-negative tensor factorization (SNTF) inherently offers high-order representation, non-negativity, and dimension reduction. SNTF has therefore become a useful tool for representing and analyzing sparse data augmented with extra contextual information, e.g., time and location, which a matrix, limited to two-way data, cannot model. However, current SNTF techniques suffer from a) non-linear time and space overhead, b) intermediate data explosion, and c) an inability to run on GPUs and multi-GPU platforms. To address these issues, a single-thread-based SNTF is proposed that operates on individual feature elements rather than on whole factor matrices, thereby avoiding the formation of large-scale intermediate matrices. A CUDA-parallelized single-thread-based SNTF (CUSNTF) model is then proposed for industrial applications on a GPU, together with a multi-GPU extension (MCUSNTF). As a result, CUSNTF achieves linear computational and space complexity, and MCUSNTF achieves linear communication cost across GPUs. We implement CUSNTF and MCUSNTF on 8 P100 GPUs and compare them with state-of-the-art parallel and distributed methods. Experimental results on several industrial datasets demonstrate the linear scalability and efficiency of CUSNTF.
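To illustrate the element-wise idea described above, the following is a minimal CUDA sketch, not the authors' CUSNTF code. It assumes a rank-R non-negative CP model of a sparse 3-way tensor in COO format and uses a hypothetical projected-SGD update (the actual CUSNTF update rule may differ); each thread processes one nonzero and touches only the three factor rows it needs, so no dense intermediate (e.g., Khatri-Rao) matrix is ever materialized. All names, constants, and the toy data are illustrative assumptions.

// sgd_update: one thread per nonzero, lock-free (Hogwild-style) writes,
// non-negativity enforced by clamping each updated feature element at zero.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

#define R 8  // assumed CP rank (illustrative)

__global__ void sgd_update(const int *ii, const int *jj, const int *kk,
                           const float *val, int nnz,
                           float *A, float *B, float *C,
                           float lr, float lambda) {
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n >= nnz) return;
    float *a = A + ii[n] * R;   // row of factor A for mode-1 index
    float *b = B + jj[n] * R;   // row of factor B for mode-2 index
    float *c = C + kk[n] * R;   // row of factor C for mode-3 index
    // predicted entry: inner product of the three factor rows
    float pred = 0.f;
    for (int r = 0; r < R; ++r) pred += a[r] * b[r] * c[r];
    float err = val[n] - pred;
    // per-feature-element gradient step with L2 regularization,
    // clamped to keep the factors non-negative
    for (int r = 0; r < R; ++r) {
        float ga = err * b[r] * c[r] - lambda * a[r];
        float gb = err * a[r] * c[r] - lambda * b[r];
        float gc = err * a[r] * b[r] - lambda * c[r];
        a[r] = fmaxf(0.f, a[r] + lr * ga);
        b[r] = fmaxf(0.f, b[r] + lr * gb);
        c[r] = fmaxf(0.f, c[r] + lr * gc);
    }
}

int main() {
    // hypothetical toy 4x4x4 tensor with 3 nonzeros, for illustration only
    const int I = 4, J = 4, K = 4, nnz = 3;
    int hi[] = {0, 1, 3}, hj[] = {2, 0, 3}, hk[] = {1, 1, 2};
    float hv[] = {1.0f, 2.0f, 0.5f};
    std::vector<float> hA(I * R, 0.5f), hB(J * R, 0.5f), hC(K * R, 0.5f);

    int *di, *dj, *dk; float *dv, *dA, *dB, *dC;
    cudaMalloc(&di, nnz * sizeof(int));  cudaMalloc(&dj, nnz * sizeof(int));
    cudaMalloc(&dk, nnz * sizeof(int));  cudaMalloc(&dv, nnz * sizeof(float));
    cudaMalloc(&dA, I * R * sizeof(float));
    cudaMalloc(&dB, J * R * sizeof(float));
    cudaMalloc(&dC, K * R * sizeof(float));
    cudaMemcpy(di, hi, nnz * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dj, hj, nnz * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dk, hk, nnz * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dv, hv, nnz * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dA, hA.data(), I * R * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), J * R * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dC, hC.data(), K * R * sizeof(float), cudaMemcpyHostToDevice);

    // a few SGD epochs over the nonzeros; work is linear in nnz * R
    int threads = 128, blocks = (nnz + threads - 1) / threads;
    for (int epoch = 0; epoch < 50; ++epoch)
        sgd_update<<<blocks, threads>>>(di, dj, dk, dv, nnz,
                                        dA, dB, dC, 0.05f, 0.001f);
    cudaDeviceSynchronize();

    cudaMemcpy(hA.data(), dA, I * R * sizeof(float), cudaMemcpyDeviceToHost);
    printf("A[0][0] after training: %f\n", hA[0]);

    cudaFree(di); cudaFree(dj); cudaFree(dk); cudaFree(dv);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}

Because each update reads and writes only O(R) feature elements per nonzero, time and memory scale linearly with the number of nonzeros, which is the property the abstract attributes to CUSNTF.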
