Tensor data is common in real-world applications such as recommendation systems and air quality monitoring, but such data is often sparse, noisy, and rapidly generated. CANDECOMP/PARAFAC (CP) is a popular tensor decomposition model that is both theoretically advantageous and numerically stable. However, learning the CP model in a Bayesian framework, though promising for handling data sparsity and noise, is computationally challenging, especially with rapidly generated data streams. This paper addresses the efficient processing of streaming tensor data. We propose BS-CP, a fast and accurate framework that dynamically updates the posterior of the latent factors whenever a new observation tensor is received. We first present the BS-CP1 algorithm, an efficient implementation based on assumed density filtering (ADF). We then propose the BS-CP2 algorithm, which uses Gauss-Laguerre quadrature to integrate out the noise effect and shows better empirical results. We tested BS-CP1 and BS-CP2 on generic real-world recommendation system datasets, including Beijing-15k, Beijing-20k, MovieLens-1m, and Fit Record. Compared with state-of-the-art methods, BS-CP1 achieves RMSE improvements of 31.8% and 33.3% on the last two datasets, and BS-CP2 shows a similar trend. These results indicate that our algorithms perform better on large datasets and are more suitable for real-world scenarios. Against most of the other baseline methods, our approach improves by more than 10% and is more stable.
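As a generic illustration of the Gauss-Laguerre quadrature mentioned above, the following minimal sketch approximates an expectation over a noise precision. It is not the BS-CP2 algorithm: the Gamma prior on the precision, the function name `gamma_expectation`, and the parameter values are assumptions made only for this example. The rule replaces an integral of the form ∫ e^(-x) f(x) dx over [0, ∞) with a weighted sum over a fixed set of nodes.

```python
import math

import numpy as np


def gamma_expectation(g, shape, rate, n_nodes=16):
    """Approximate E[g(tau)] for tau ~ Gamma(shape, rate) (hypothetical helper).

    Substituting x = rate * tau turns the expectation into
    (1 / Gamma(shape)) * integral_0^inf exp(-x) * x**(shape - 1) * g(x / rate) dx,
    which matches the Gauss-Laguerre form with weight function exp(-x).
    """
    # Nodes and weights of the n_nodes-point Gauss-Laguerre rule.
    x, w = np.polynomial.laguerre.laggauss(n_nodes)
    integrand = x ** (shape - 1.0) * g(x / rate) / math.gamma(shape)
    return float(np.sum(w * integrand))


# Assumed example values: a Gamma(2, 3) prior on the noise precision tau.
mean_tau = gamma_expectation(lambda t: t, shape=2.0, rate=3.0)
second_moment_tau = gamma_expectation(lambda t: t ** 2, shape=2.0, rate=3.0)
```

For these inputs the integrands are low-degree polynomials in x, so the rule reproduces the closed-form moments shape/rate and shape(shape+1)/rate² up to floating-point error. Integrands that are not polynomial, such as a Gaussian likelihood in the precision, are only approximated and may need more nodes.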