Efficient GPU implementation of randomized SVD and its applications

Łukasz Struski, Paweł Morkisz, Przemysław Spurek, Samuel Rodriguez Bernabeu, Tomasz Trzciński

doi:10.1016/j.eswa.2024.123462

Open Access

https://doi.org/10.1016/j.eswa.2024.123462

Abstract

Matrix decompositions are ubiquitous in machine learning, with applications in dimensionality reduction, data compression, and deep learning algorithms. Typical solutions for matrix decompositions have polynomial complexity, which significantly increases their computational cost and time. In this work, we leverage efficient processing operations that can be run in parallel on modern Graphics Processing Units (GPUs), the predominant computing architecture used, e.g., in deep learning, to reduce the computational burden of computing matrix decompositions. More specifically, we reformulate the randomized decomposition problem to incorporate fast matrix multiplication operations (BLAS-3) as building blocks. We show that this formulation, combined with fast random number generators, allows us to fully exploit the potential of parallel processing implemented in GPUs. Our extensive evaluation confirms the superiority of this approach over competing methods, and we release the results of this research as part of the official CUDA implementation (see https://docs.nvidia.com/cuda/cusolver/index.html).
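To illustrate the idea of building a randomized decomposition mostly from matrix-matrix products, here is a minimal CPU sketch in Python with NumPy. It follows the widely used random-projection scheme (Gaussian test matrix, QR range finder, optional power iterations, SVD of a small projected matrix) and is not the authors' GPU implementation. The function name `randomized_svd` and the parameters `oversample`, `power_iters`, and `seed` are assumptions chosen for this example.

```python
import numpy as np


def randomized_svd(A, rank, oversample=10, power_iters=2, seed=0):
    """Approximate rank-`rank` SVD of A via random projection (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Sketch size: target rank plus a small oversampling margin, capped by the matrix size.
    k = min(rank + oversample, m, n)

    # Random test matrix; on a GPU this step would rely on a fast random number generator.
    omega = rng.standard_normal((n, k))

    # Sample the range of A with one matrix-matrix product (a BLAS-3 operation).
    Q, _ = np.linalg.qr(A @ omega)

    # Optional power iterations sharpen the basis when singular values decay slowly.
    # Re-orthonormalizing after each product keeps the iteration numerically stable.
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)

    # Project A onto the small subspace and decompose the k x n matrix exactly.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)

    # Lift the left singular vectors back to the original space and truncate.
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]


# Usage: recover an exactly rank-5 matrix and report the relative reconstruction error.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 80))
U, s, Vt = randomized_svd(A, rank=5)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(rel_err)
```

Almost all of the work in this sketch sits in the products `A @ omega`, `A.T @ Q`, `A @ Z`, and `Q.T @ A`, which is why a formulation of this kind maps well onto hardware optimized for dense matrix multiplication. The QR and small SVD steps operate on matrices with only `k` columns or rows.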


Journal: Expert Systems With Applications
Publication Date: Feb 14, 2024
Citations: 1

