Abstract

Graphics processing units (GPUs) have delivered remarkable performance for a variety of high performance computing (HPC) applications through massive parallelism. One such application is sparse matrix-vector (SpMV) computation, which is central to many scientific, engineering, and other applications, including machine learning. No single SpMV storage or computation scheme provides consistent and sufficiently high performance for all matrices due to their varying sparsity patterns. An extensive literature review reveals that the performance of SpMV techniques on GPUs has not been studied in sufficient detail. In this paper, we provide a detailed performance analysis of SpMV on GPUs using four notable sparse matrix storage schemes (compressed sparse row (CSR), ELLPACK (ELL), hybrid ELL/COO (HYB), and compressed sparse row 5 (CSR5)), five performance metrics (execution time, giga floating point operations per second (GFLOPS), achieved occupancy, instructions per warp, and warp execution efficiency), five matrix sparsity features (nnz, anpr, nprvariance, maxnpr, and distavg), and 17 sparse matrices from 10 application domains (chemical simulations, computational fluid dynamics (CFD), electromagnetics, linear programming, economics, etc.). Subsequently, based on the deeper insights gained through the detailed performance analysis, we propose a technique called the heterogeneous CPU–GPU hybrid (HCGHYB) scheme. It utilizes both the CPU and the GPU in parallel and outperforms the HYB format with an average speedup of 1.7x. Heterogeneous computing is an important direction for SpMV and other application areas. Moreover, to the best of our knowledge, this is the first work in which SpMV performance on GPUs has been analyzed in such depth. We believe that this work on SpMV performance analysis and the heterogeneous scheme will open up many new directions and improvements for the SpMV computing field in the future.
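To ground the discussion of storage schemes, the sketch below shows a minimal scalar CSR SpMV kernel in CUDA, in the style popularized by early GPU SpMV work. It is an illustration only, not the implementation benchmarked in this paper; the kernel and array names (spmv_csr_scalar, row_ptr, col_idx, vals) are our own.

    // Illustrative scalar CSR SpMV kernel: y = A*x, one thread per row.
    // row_ptr holds num_rows + 1 offsets; col_idx and vals hold the
    // nnz column indices and values in row-major order.
    __global__ void spmv_csr_scalar(int num_rows,
                                    const int *row_ptr,
                                    const int *col_idx,
                                    const double *vals,
                                    const double *x,
                                    double *y)
    {
        int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row < num_rows) {
            double dot = 0.0;
            for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
                dot += vals[j] * x[col_idx[j]];  // gather from x by column
            y[row] = dot;
        }
    }

A typical launch is spmv_csr_scalar<<<(num_rows + 255) / 256, 256>>>(...). Because each thread's loop length equals its row's nonzero count, rows of very different lengths within a warp cause divergence and idle lanes, one of the effects the warp execution efficiency metric above captures, and a key reason alternative formats such as ELL, HYB, and CSR5 exist.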

Highlights

  • Sparse matrix-vector multiplication (SpMV) is fundamental to many scientific, engineering, and other applications [1,2,3,4,5,6,7,8,9]

  • We have conducted an extensive review of SpMV techniques on graphics processing units (GPUs) and find that the performance of SpMV techniques on GPUs has not been studied in sufficient detail; see, for instance, the most recent review of SpMV computations on GPUs [55]

  • We believe that this work on SpMV performance analysis and the heterogeneous scheme will open up many new directions and improvements for the SpMV computing field in the future

Introduction

Sparse matrix-vector multiplication (SpMV) is fundamental to many scientific, engineering, and other applications [1,2,3,4,5,6,7,8,9]. Our line of research broadly aims to analyze and improve the performance of SpMV computations. Towards this aim, we have developed a range of techniques over the years, proposing novel storage schemes and algorithms on CPU [66,67,68,69,70,71,72], MIC [73,74,75], and GPU architectures [76,77].

The remaining sections of the paper are:

  • Dataset and Sparsity Features
  • Performance Metrics
  • Execution Time
  • GPU Throughput
  • GPU Utilization
  • Motivation and Description
  • HCGHYB
  • Findings
  • Conclusions and Future Work
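To illustrate the heterogeneous idea behind the HCGHYB scheme named in the abstract and the outline above, the hedged sketch below splits a HYB matrix across devices: the GPU computes the regular ELL part while the CPU concurrently accumulates the irregular COO part, and the two partial results are merged at the end. The work split, the function names, and the use of OpenMP and a CUDA stream are assumptions for exposition; the paper's actual HCGHYB design may partition and synchronize the work differently.

    #include <cuda_runtime.h>
    #include <vector>

    // ELL part on the GPU: one thread per row over a fixed-width,
    // column-major ELL layout; a column index of -1 marks padding.
    __global__ void spmv_ell(int num_rows, int ell_width,
                             const int *cols, const double *vals,
                             const double *x, double *y)
    {
        int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row >= num_rows) return;
        double dot = 0.0;
        for (int j = 0; j < ell_width; ++j) {
            int idx = j * num_rows + row;
            int col = cols[idx];
            if (col >= 0)
                dot += vals[idx] * x[col];
        }
        y[row] = dot;
    }

    // Hypothetical HCGHYB-style driver: GPU (ELL) and CPU (COO) run
    // in parallel; the paper's actual scheme may differ.
    void hcghyb_spmv_sketch(int num_rows, int ell_width,
                            const int *d_ell_cols, const double *d_ell_vals,
                            const double *d_x, double *d_y,        // device
                            int coo_nnz, const int *h_coo_rows,
                            const int *h_coo_cols, const double *h_coo_vals,
                            const double *h_x, double *h_y,        // host
                            cudaStream_t stream)
    {
        int threads = 256, blocks = (num_rows + threads - 1) / threads;
        // 1. Launch the ELL kernel asynchronously; the host keeps running.
        spmv_ell<<<blocks, threads, 0, stream>>>(
            num_rows, ell_width, d_ell_cols, d_ell_vals, d_x, d_y);

        // 2. Meanwhile, accumulate the COO tail on the CPU.
        std::vector<double> y_coo(num_rows, 0.0);
        #pragma omp parallel for
        for (int k = 0; k < coo_nnz; ++k) {
            #pragma omp atomic
            y_coo[h_coo_rows[k]] += h_coo_vals[k] * h_x[h_coo_cols[k]];
        }

        // 3. Merge: fetch the ELL partial result and add the COO part.
        cudaMemcpyAsync(h_y, d_y, num_rows * sizeof(double),
                        cudaMemcpyDeviceToHost, stream);
        cudaStreamSynchronize(stream);
        for (int i = 0; i < num_rows; ++i)
            h_y[i] += y_coo[i];
    }

The overlap in steps 1 and 2 is where a heterogeneous scheme can gain: while the GPU streams through the regular ELL part, otherwise idle CPU cores handle the irregular leftovers, which is consistent with the abstract's reported average 1.7x improvement over the GPU-only HYB format.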