Abstract

The CANDECOMP/PARAFAC (CP) tensor decomposition is a popular dimensionality-reduction method for multiway data. Dimensionality reduction is often sought because many high-dimensional tensors have low intrinsic rank relative to the dimension of the ambient measurement space. However, the emergence of ‘big data’ poses significant challenges for computing this fundamental tensor decomposition. By leveraging modern randomized algorithms, we demonstrate that coherent structures can be learned from a smaller representation of the tensor in a fraction of the time. This simple but powerful algorithm enables one to compute the approximate CP decomposition even for massive tensors. The approximation error can be controlled via oversampling and power iterations. In addition to theoretical results, several empirical results demonstrate the performance of the proposed algorithm.
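
To make the roles of oversampling and power iterations concrete, the following is a minimal NumPy sketch of one common way to build a "smaller representation" of a tensor: a randomized range finder applied to each mode unfolding. It is an illustration under stated assumptions, not the authors' implementation. The function names (`unfold`, `randomized_basis`, `compress`) and the default values of `oversample` and `power_iters` are hypothetical choices made for this example.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def randomized_basis(matrix, rank, oversample=10, power_iters=2, seed=0):
    """Approximate orthonormal basis for the column space of `matrix`.

    `oversample` adds extra random directions beyond the target rank;
    `power_iters` sharpens the basis when singular values decay slowly.
    Both defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    k = min(rank + oversample, matrix.shape[0])
    # Sketch the column space with a Gaussian test matrix.
    sketch = matrix @ rng.standard_normal((matrix.shape[1], k))
    basis, _ = np.linalg.qr(sketch)
    for _ in range(power_iters):
        # Re-orthonormalize between multiplications for numerical stability.
        z, _ = np.linalg.qr(matrix.T @ basis)
        basis, _ = np.linalg.qr(matrix @ z)
    return basis

def compress(tensor, rank, **kwargs):
    """Project every mode onto a randomized basis; return (core, bases)."""
    core, bases = tensor, []
    for mode in range(tensor.ndim):
        q = randomized_basis(unfold(core, mode), rank, **kwargs)
        # Mode product with q.T: contract q's row index with axis `mode`.
        core = np.moveaxis(np.tensordot(q.T, core, axes=(1, mode)), 0, mode)
        bases.append(q)
    return core, bases
```

In a scheme of this kind, a CP decomposition would be computed on the small `core`, and the resulting factor matrices mapped back to the original space by multiplying each one with the matching entry of `bases`.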

Highlights

  • Advances in data acquisition and storage technology have enabled the acquisition of massive amounts of data in a wide range of emerging applications

  • Traditionally employed matrix decomposition techniques such as the singular value decomposition (SVD) and principal component analysis (PCA) can become inadequate when dealing with multidimensional data

  • The randomized CP algorithm is evaluated on a number of examples, where a near-optimal approximation of massive tensors is achieved in a fraction of the time

Summary

Introduction

Advances in data acquisition and storage technology have enabled the acquisition of massive amounts of data in a wide range of emerging applications. Numerous applications across the physical, biological, social and engineering sciences generate large multidimensional, multi-relational and/or multi-modal data. Efficient analysis of this data requires dimensionality reduction techniques. Traditionally employed matrix decomposition techniques such as the singular value decomposition (SVD) and principal component analysis (PCA) can become inadequate when dealing with multidimensional data. This is because reshaping multi-modal data into matrices, or data flattening, can fail to reveal important structures in the data. The method of parallel factors (PARAFAC) was introduced in chemometrics by Harshman (1970). This method became known as the CP (CANDECOMP/PARAFAC) decomposition. Tensor decompositions enjoy increasing popularity, yet runtime bottlenecks persist.
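
As a small illustration of the two ideas named above, the sketch below builds a three-way tensor from a CP model (a weighted sum of rank-one outer products) and then flattens it along one mode. The helper names `cp_to_tensor` and `flatten_mode` are hypothetical and are not taken from the paper. The three-way restriction is an assumption made to keep the example short.

```python
import numpy as np

def cp_to_tensor(weights, factors):
    """Assemble a three-way tensor from a rank-R CP model.

    `factors` holds three matrices of shapes (I, R), (J, R), (K, R);
    entry (i, j, k) is sum_r weights[r] * a[i, r] * b[j, r] * c[k, r].
    """
    a, b, c = factors
    return np.einsum('r,ir,jr,kr->ijk', weights, a, b, c)

def flatten_mode(tensor, mode):
    """Data flattening: reshape the tensor into a matrix along one mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# Example: a rank-2 model of size 4 x 5 x 6 with random factors.
rng = np.random.default_rng(0)
factors = [rng.standard_normal((n, 2)) for n in (4, 5, 6)]
weights = np.array([2.0, 0.5])
tensor = cp_to_tensor(weights, factors)
matrix = flatten_mode(tensor, 0)  # shape (4, 30)
```

The flattened matrix keeps the low rank of the model, but the separate roles of the second and third modes are merged into a single column index, which is the kind of structure a matrix method such as the SVD cannot separate again.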

