Performance Aspects of Sparse Matrix-Vector Multiplication

I Šimeček

doi:10.14311/826

Abstract

Sparse matrix-vector multiplication (SpM×V for short) is an important building block in algorithms that solve sparse systems of linear equations, e.g., in FEM. Because the matrix is sparse, the memory access patterns are irregular and cache utilization can suffer from low spatial or temporal locality. Approaches to improving the performance of SpM×V are based on matrix reordering and register blocking [1, 2], sometimes combined with software pipelining [3]. Due to its overhead, register blocking achieves good speedups only when SpM×V is executed many times with the same matrix A. We have investigated the impact of two simple software transformation techniques (software pipelining and loop unrolling) on the performance of SpM×V, and have compared it with several implementation modifications aimed at reducing computational and memory complexity and improving spatial locality. We investigate the performance gains of these modifications on four CPU platforms.

Highlights

Sparse matrix-vector multiplication is an important building block in algorithms solving sparse systems of linear equations, e.g., FEM
The first nonzero element of row j is stored at index adr[j] in array A
The data is represented as in the Compressed sparse row (CSR) format, but the rows are sorted by length in increasing order
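The two storage-related highlights above can be illustrated with a minimal C sketch. It is not the paper's code: the array `adr` follows the highlight, while `val` (standing in for the array A of nonzeros), `col`, `perm`, and both function names are hypothetical, and `adr[n]` is assumed to hold the total number of nonzeros. The second function assumes one plausible realization of the length-sorted variant, in which rows are stored in increasing order of length and `perm[i]` gives the original index of the i-th stored row.

```c
#include <stddef.h>

/* y = A*x for an n-row matrix in CSR form.
 * val[] : nonzero values stored row by row (hypothetical name)
 * col[] : column index of each nonzero (hypothetical name)
 * adr[] : adr[j] is the index of the first nonzero of row j;
 *         adr[n] is assumed to equal the number of nonzeros */
void spmv_csr(size_t n, const size_t *adr, const size_t *col,
              const double *val, const double *x, double *y)
{
    for (size_t j = 0; j < n; ++j) {
        double sum = 0.0;
        for (size_t k = adr[j]; k < adr[j + 1]; ++k)
            sum += val[k] * x[col[k]];
        y[j] = sum;
    }
}

/* Same product when the rows are stored sorted by length.
 * perm[i] (an assumption of this sketch) is the original row
 * index of the i-th stored row, so results land in the
 * original ordering of y. */
void spmv_csr_sorted(size_t n, const size_t *adr, const size_t *col,
                     const double *val, const size_t *perm,
                     const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i) {
        double sum = 0.0;
        for (size_t k = adr[i]; k < adr[i + 1]; ++k)
            sum += val[k] * x[col[k]];
        y[perm[i]] = sum;
    }
}
```

Sorting rows by length groups rows with the same number of nonzeros, which is what would let an implementation specialize the inner loop per row length; the sketch above only shows the data layout and does not attempt that specialization.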

Summary

Sparse matrix-vector multiplication (SpM×V for short) is an important building block in algorithms that solve sparse systems of linear equations, e.g., in FEM. The memory access patterns are irregular and cache utilization can suffer from low spatial or temporal locality. Approaches to improving the performance of SpM×V are based on matrix reordering and register blocking [1, 2], sometimes combined with software pipelining [3]. Register blocking achieves good speedups only when SpM×V is executed many times with the same matrix A. We have investigated the impact of two simple software transformation techniques (software pipelining and loop unrolling) on the performance of SpM×V, and have compared it with several implementation modifications aimed at reducing computational and memory complexity and improving spatial locality.
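To make the two software transformations named above concrete, here is a hedged C sketch of each applied to a plain CSR kernel. The unrolling factor of 2, the function names, and the arrays `val` and `col` are assumptions for illustration; the paper's actual variants may differ. Note that the unrolled version sums in a different order than a straightforward loop, so floating-point results can differ in the last bits.

```c
#include <stddef.h>

/* Loop unrolling: the inner loop handles two nonzeros per
 * iteration using two independent accumulators; a leftover
 * element (odd row length) is handled after the loop. */
void spmv_csr_unroll2(size_t n, const size_t *adr, const size_t *col,
                      const double *val, const double *x, double *y)
{
    for (size_t j = 0; j < n; ++j) {
        size_t k = adr[j], end = adr[j + 1];
        double s0 = 0.0, s1 = 0.0;
        for (; k + 1 < end; k += 2) {
            s0 += val[k] * x[col[k]];
            s1 += val[k + 1] * x[col[k + 1]];
        }
        if (k < end)
            s0 += val[k] * x[col[k]];
        y[j] = s0 + s1;
    }
}

/* Software pipelining: the operands of the next nonzero are
 * loaded before the multiply-add of the current one, so loads
 * and arithmetic from neighbouring iterations can overlap. */
void spmv_csr_pipelined(size_t n, const size_t *adr, const size_t *col,
                        const double *val, const double *x, double *y)
{
    for (size_t j = 0; j < n; ++j) {
        size_t k = adr[j], end = adr[j + 1];
        double sum = 0.0;
        if (k < end) {
            double a = val[k], b = x[col[k]];   /* prologue */
            for (++k; k < end; ++k) {
                double a_next = val[k], b_next = x[col[k]];
                sum += a * b;
                a = a_next;
                b = b_next;
            }
            sum += a * b;                       /* epilogue */
        }
        y[j] = sum;
    }
}
```

Both functions handle empty rows. An optimizing compiler may already perform similar transformations on the plain loop, so any benefit from writing them by hand would need to be measured on the target platform.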

Storage schemes for sparse matrices

Code restructuring

Improving the performance of sparse matrix-vector multiplication

HW and SW configuration

Evaluation of the results:

Conclusion


Journal: Acta Polytechnica
Publication Date: Jan 3, 2006
Citations: 7
License type: cc-by


Similar Papers

Acceleration of Sparse Matrix-Vector Multiplication by Region Traversal
I Šimeček
Acta Polytechnica | VOL. 48
04 Jan 2008

A New Diagonal Blocking Format and Model of Cache Behavior for Sparse Matrices
Pavel Tvrdík ... Ivan Šimeček
01 Jan 2006

Sparse Matrix-Vector Multiplication - Final Solution?
Ivan Šimeček ... Pavel Tvrdík
09 Sep 2007

ReCALL: Reordered Cache Aware Locality Based Graph Processing
Kartik Lakhotia ... Rajgopal Kannan
01 Dec 2017
