Abstract
A partial singular value decomposition (SVD) of the sparse matrix representing the data is a powerful tool for data analysis. However, computing the SVD of a large matrix can take a significant amount of time even on a modern high-performance supercomputer. Hence, there is growing interest in novel algorithms that can quickly compute the SVD, enabling efficient processing of the massive amounts of data generated by many modern applications. To respond to this demand, in this paper we study randomized algorithms that update the SVD as changes are made to the data, which is often more efficient than recomputing the SVD from scratch. Furthermore, in some applications, recomputing the SVD may not even be possible because the original data, for which the SVD has already been computed, is no longer available. Our experimental results with data sets for Latent Semantic Indexing and population clustering demonstrate that these randomized algorithms can obtain the desired SVD accuracy with a small number of data accesses and, compared with the state-of-the-art updating algorithm, often require much lower computational and communication costs. Our performance results on a hybrid CPU/GPU cluster show that these randomized algorithms can achieve significant speedups over the state-of-the-art updating algorithm.
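To make the idea of updating an SVD without revisiting the original matrix concrete, the following is a minimal sketch in NumPy. It assumes the common row-update setting, where new rows E are appended to a matrix A whose rank-k SVD factors are already available, and it uses a randomized range finder in the style of Halko, Martinsson, and Tropp (2011). The function name, parameters, and oversampling choice are illustrative assumptions, not the paper's exact algorithm or API; the key point it demonstrates is that every product involving A is formed through the stored factors, so the original data is never accessed.

```python
import numpy as np

def randomized_svd_row_update(U, s, Vt, E, k, oversample=10, seed=0):
    """Hypothetical sketch: rank-k SVD of B = [A; E], where
    A ~= U @ diag(s) @ Vt is a previously computed rank-k SVD.
    Only the old factors (U, s, Vt) and the new rows E are touched;
    the original matrix A itself is never needed."""
    rng = np.random.default_rng(seed)
    m, n = U.shape[0], Vt.shape[1]
    r = k + oversample  # sketch size with oversampling

    # Sample the range of B with a Gaussian test matrix,
    # applying A @ Omega implicitly through the stored factors.
    Omega = rng.standard_normal((n, r))
    Y = np.vstack([U @ (s[:, None] * (Vt @ Omega)),  # A @ Omega
                   E @ Omega])                        # new rows
    Q, _ = np.linalg.qr(Y)  # orthonormal basis for the sampled range

    # Project B onto the basis: C = Q^T B, again via the factors only.
    Q1, Q2 = Q[:m], Q[m:]
    C = (Q1.T @ U) @ (s[:, None] * Vt) + Q2.T @ E  # small r x n matrix

    # SVD of the small projected matrix, then lift back to full size.
    Uc, sc, Vct = np.linalg.svd(C, full_matrices=False)
    return Q @ Uc[:, :k], sc[:k], Vct[:k]
```

Because the sketch Y and the projected matrix C involve only k + oversample columns, the dominant costs scale with the number of new rows and the target rank rather than with the full matrix, which is the source of the data-access and communication savings the abstract refers to.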