Abstract

Given data from a variety of sources that share a number of dimensions, how can we effectively decompose them jointly into interpretable latent factors? The coupled tensor decomposition framework captures this idea by supporting the joint decomposition of several CP tensors. However, coupling tends to suffer when one dimension of the data is irregular, i.e., when one of the tensor's dimensions is uneven across slices, as in the case of PARAFAC2. In this work, we provide a scalable method for decomposing coupled CP and PARAFAC2 tensor datasets through non-negativity-constrained least squares optimization on a variety of objective functions. We offer the following contributions: (1) Our algorithm, $C^3$APTION, performs coupled factorization with active-set, block principal pivoting, and least squares optimization methods, including non-negative factorization under the Frobenius norm. (2) $C^3$APTION scales to billions of non-zero elements in both the data and the model. Comprehensive experiments on large data confirm that $C^3$APTION is up to 5× faster and 70-80% more accurate than several baselines. We present results showing the scalability of this implementation on a billion elements and demonstrate the high level of interpretability of the latent factors it produces, suggesting that coupling is a promising framework for large-scale, unsupervised pattern exploration and cluster discovery.
