Abstract

Tensor decomposition is often used to extract underlying features when analyzing large, multi-dimensional data. For sparse tensor data, the sparse Matricized Tensor Times Khatri-Rao Product (MTTKRP) is the performance bottleneck of the Alternating Least Squares (ALS) method, which is widely used for tensor decomposition. Because ALS computes MTTKRP for each mode of the tensor, MTTKRP must be efficient for all modes, not just for one specific mode. In addition, for large-scale data, converting the input tensor into the optimal format for each mode is difficult in terms of memory usage. We propose a fast MTTKRP algorithm, based on the widely used COO format, that uses a single replica of the tensor for all modes. The method is scalable, fast, and memory-saving, and it does not require the exclusive control that the conventional COO-based MTTKRP algorithm needs. In our performance evaluation, the proposed method achieved a significant speed-up over the existing method. For the MTTKRP calculation, the speed-up was up to 4.8x (1.3x on average) on Intel Xeon and up to 9.9x (2.3x on average) on Fujitsu A64FX. For the ALS method as a whole, our approach achieved a speed-up of up to 1.8x over the existing method on Fujitsu A64FX.
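For readers unfamiliar with the operation, the following is a minimal sketch of the conventional COO-based MTTKRP that the abstract uses as its point of comparison. It is not the proposed algorithm, which the abstract does not describe in detail. The function name `mttkrp_coo` and the array layout (an `(nnz, N)` index array plus a value vector) are assumptions made for illustration. The final scatter-add accumulates several nonzeros into the same output row, and this is the step that would need exclusive control (atomics or locks) in a multithreaded implementation.

```python
import numpy as np

def mttkrp_coo(indices, values, factors, mode):
    """Baseline MTTKRP for a sparse tensor stored in COO format.

    indices: (nnz, N) integer array, one row of coordinates per nonzero
    values:  (nnz,) float array of nonzero values
    factors: list of N factor matrices, factors[n] has shape (I_n, R)
    mode:    the mode whose factor matrix is being updated
    Returns an (I_mode, R) array.
    """
    nmodes = indices.shape[1]
    rank = factors[0].shape[1]

    # One length-R row per nonzero, starting from the nonzero's value.
    rows = np.repeat(values[:, None].astype(float), rank, axis=1)

    # Multiply in the matching rows of every factor matrix except `mode`
    # (the per-nonzero row of the Khatri-Rao product).
    for n in range(nmodes):
        if n != mode:
            rows *= factors[n][indices[:, n]]

    # Scatter-add into the output. Different nonzeros can share the same
    # output row, so a parallel version of this step needs exclusive control.
    out = np.zeros((factors[mode].shape[0], rank))
    np.add.at(out, indices[:, mode], rows)
    return out
```

The same COO arrays serve every mode here, since only the `mode` argument changes between calls; the cost of that convenience in the baseline is the conflicting writes in the scatter-add.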
