CSTF

Zachary Blanco, Bangtian Liu, Maryam Mehri Dehnavi

doi:10.1145/3225058.3225133

Abstract

Tensors, or N-dimensional arrays, are increasingly used to represent multi-dimensional data. Sparse tensor decomposition algorithms are of particular interest for analyzing and compressing big datasets because most real-world data is sparse and multi-dimensional. However, state-of-the-art tensor decomposition algorithms do not scale to very large, higher-order sparse tensors on distributed platforms. In this paper, we use the MapReduce model and the Spark engine to implement tensor factorizations on distributed platforms. The proposed algorithm, CSTF (Cloud-based Sparse Tensor Factorization), is a scalable distributed algorithm for decomposing large tensors. It uses the coordinate storage format (COO) to operate on the tensor nonzeros directly, thus eliminating the need for tensor unfolding and the storage of intermediate data. In addition, a novel queuing strategy (QCOO) is proposed to exploit the dependencies and data reuse between successive tensor operations in tensor decomposition algorithms. Details on the key-value storage paradigm and the Spark features used to implement the algorithm and the data reuse strategies are also provided. Compared with the COO-based implementation, the queuing strategy reduces data communication costs by 35% for 3rd-order tensors and by 31% for 4th-order tensors. Compared to the state-of-the-art work, BIGtensor, CSTF achieves 2.2× to 6.9× speedup for 3rd-order tensor decompositions.
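To make the idea of operating on COO nonzeros directly more concrete, here is a minimal sketch in plain Python. It is not the authors' Spark implementation. It assumes a CP-style decomposition whose core kernel is the matricized-tensor-times-Khatri-Rao product (MTTKRP), which the abstract does not state explicitly. The function name `mttkrp_mode0`, the factor matrices `B` and `C`, and the tiny tensor are all hypothetical. The map and reduce-by-key steps are emulated with a generator and a dictionary rather than a distributed engine.

```python
from collections import defaultdict


def mttkrp_mode0(nonzeros, B, C, rank):
    """Mode-0 MTTKRP for a 3rd-order tensor stored as COO nonzeros.

    nonzeros: iterable of ((i, j, k), value) pairs (hypothetical layout)
    B, C:     factor matrices for modes 1 and 2, as lists of rows
    rank:     number of columns in each factor matrix
    Returns a dict mapping row index i to its accumulated row vector.
    """
    # "Map" step: each nonzero emits a (key, row-contribution) pair,
    # so the tensor is never unfolded into a matrix.
    emitted = (
        (i, [val * B[j][r] * C[k][r] for r in range(rank)])
        for (i, j, k), val in nonzeros
    )

    # "Reduce by key" step: sum the contributions that share row index i.
    rows = defaultdict(lambda: [0.0] * rank)
    for i, contrib in emitted:
        acc = rows[i]
        for r in range(rank):
            acc[r] += contrib[r]
    return dict(rows)


# Tiny hypothetical 2 x 2 x 2 tensor with three nonzeros.
nonzeros = [((0, 0, 0), 1.0), ((0, 1, 1), 2.0), ((1, 0, 1), 3.0)]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]

M = mttkrp_mode0(nonzeros, B, C, rank=2)
```

In a key-value engine, the two steps would correspond roughly to a map over the nonzeros followed by a keyed aggregation. How the factor rows are brought to each nonzero is where communication cost arises, and that is the part this sketch leaves out.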

Similar Papers

Efficient and scalable computations with sparse tensors
Muthu Baskaran ... Nicolas Vasilache
01 Sep 2012

Multi-Aspect Incremental Tensor Decomposition Based on Distributed In-Memory Big Data Systems
Hye-Kyung Yang ... Hwan-Seung Yong
Journal of Data and Information Science | VOL. 5
01 Apr 2020

Scalable sparse tensor decompositions in distributed memory systems
Oguz Kaya ... Bora Uçar
15 Nov 2015

Large-scale Sparse Tensor Decomposition Using a Damped Gauss-Newton Method
Teresa M Ranadive ... Muthu M Baskaran
22 Sep 2020
