Abstract

CANDECOMP/PARAFAC (CP) tensor decomposition is a popular method for detecting latent behaviors in real-world data sets. As data sets grow larger and more elaborate, more sophisticated CP decomposition algorithms are required to enable these discoveries. Data sets from many applications can be represented as count tensors. To decompose count tensors, one should minimize the sum of the generalized Kullback–Leibler divergences from each tensor entry to the corresponding decomposition entry. Most often, this is done using the algorithm CP-APR (CP Alternating Poisson Regression). Because all-at-once optimization algorithms for related CP decomposition problems often achieve better decomposition accuracy than alternating algorithms like CP-APR, we develop CP-POPT-GDGN, an all-at-once algorithm for count tensor decomposition that utilizes a generalized damped Gauss–Newton method. We then implement a highly efficient version of CP-POPT-GDGN in the tensor package ENSIGN. After decomposing several tensors formed from real-world data sets, many of which are related to network traffic, we find that CP-POPT-GDGN typically outperforms CP-APR, both in decomposition accuracy and in latent behavior detection.
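The objective mentioned above, the sum of generalized Kullback–Leibler divergences between tensor entries and model entries, can be illustrated with a minimal sketch. The code below is not the paper's implementation (neither CP-APR nor CP-POPT-GDGN); it only evaluates the generalized KL loss for a 3-way rank-R CP model, with function names chosen for illustration.

```python
import numpy as np

def cp_reconstruct(factors):
    # 3-way CP model: M[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r]
    A, B, C = factors
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def gen_kl_divergence(X, M, eps=1e-12):
    # Generalized KL divergence: sum( X*log(X/M) - X + M ),
    # with the convention 0*log(0) = 0; eps guards against log(0).
    X = np.asarray(X, dtype=float)
    log_term = np.where(X > 0, X * np.log((X + eps) / (M + eps)), 0.0)
    return float(np.sum(log_term - X + M))

# Tiny synthetic example: rank-2 model of a 4x4x4 count tensor.
rng = np.random.default_rng(0)
factors = [rng.uniform(0.1, 1.0, size=(4, 2)) for _ in range(3)]
M = cp_reconstruct(factors)          # nonnegative model tensor
X = rng.poisson(M)                   # count tensor drawn from the model
loss = gen_kl_divergence(X, M)       # quantity the algorithms minimize
```

Minimizing this loss over the factor matrices is the Poisson maximum-likelihood fitting problem that both CP-APR (alternating) and CP-POPT-GDGN (all-at-once) address.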

