Abstract

Sparse matrix-vector multiplication (SpMV) is unavoidable in almost all kinds of scientific computation, such as iterative methods for solving linear systems and eigenvalue problems. With the emergence and development of Graphics Processing Units (GPUs), highly efficient formats for SpMV need to be constructed. The performance of SpMV is mainly determined by the storage format of the sparse matrix. Based on the idea of the JAD format, this paper improves the ELLPACK-R format by reducing the waiting time between different threads in a warp, achieving a speedup of about 1.5 in our experiments. Compared with other formats, such as CSR, ELL, and BiELL, our format gives the best SpMV performance on more than 70 percent of the test matrices. We also propose a parameter-based method to analyze how different formats affect performance. In addition, we construct a formula to count the amount of computation and the number of iterations.

Highlights

  • Sparse matrix-vector multiplication (SpMV) is a key operation in many areas of computational science, such as iterative methods for solving linear systems (Ax = b), image processing, and simulation

  • There are many storage formats for sparse matrices, such as compressed sparse row (CSR), ELL, the hybrid format (HYB), and BiELL

  • SpMV based on the COO format is not well suited to the GPU architecture when the matrix entries are stored in no particular order

Summary

Introduction

A GPU contains many stream processors, and many threads can process multiple groups of data simultaneously, giving it high computational power and very high memory bandwidth. To improve computational efficiency, it is important to find a suitable matrix storage format and calculation method. There are many storage formats for sparse matrices, such as CSR, ELL, HYB, and BiELL. As shown in [5], ELL performs well on structured matrices because its memory accesses are contiguous. The ELLPACK-R format presented in [6] is optimized to reduce the waiting time between different threads.
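
To make the ELLPACK-R idea concrete, the following is a minimal CPU-side sketch in plain Python, not the implementation from [6] or from this paper. It assumes the usual description of the format: values and column indices are padded to the longest row and stored column by column, and an extra array holds the true length of each row so that a row's loop stops early instead of running over padding. The function names `to_ellpack_r` and `spmv_ellpack_r` are hypothetical, and the outer loop over rows stands in for what would be one thread per row on a GPU.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) into ELLPACK-R style arrays.

    Returns (vals, cols, row_len). vals and cols are padded to the longest
    row and stored column-major: entry k of row i is at index k * n_rows + i.
    """
    n_rows = len(dense)
    # Collect the (column, value) pairs of the nonzeros in each row.
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    row_len = [len(r) for r in rows]
    width = max(row_len, default=0)
    vals = [0.0] * (width * n_rows)
    cols = [0] * (width * n_rows)
    for i, r in enumerate(rows):
        for k, (j, v) in enumerate(r):
            vals[k * n_rows + i] = v
            cols[k * n_rows + i] = j
    return vals, cols, row_len


def spmv_ellpack_r(vals, cols, row_len, x):
    """Compute y = A x for a matrix held in the arrays built above."""
    n_rows = len(row_len)
    y = [0.0] * n_rows
    for i in range(n_rows):          # one GPU thread per row in a real kernel
        acc = 0.0
        for k in range(row_len[i]):  # stop at the true row length, skip padding
            idx = k * n_rows + i
            acc += vals[idx] * x[cols[idx]]
        y[i] = acc
    return y


if __name__ == "__main__":
    A = [[4, 0, 1],
         [0, 2, 0],
         [3, 0, 5]]
    vals, cols, row_len = to_ellpack_r(A)
    print(spmv_ellpack_r(vals, cols, row_len, [1, 2, 3]))
```

In this layout, consecutive rows read consecutive array entries at each step k, which is what allows contiguous memory access on a GPU. Rows of very different lengths in the same warp still leave some threads idle, which is the waiting time the formats discussed here try to reduce.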

Basic Formats to Sparse Matrices
COO Format
CSR Format
ELL-Like Formats
Our New Format
Numerical Result
Findings
Conclusions
