Abstract

Sparse matrix-vector multiplication (SpM×V for short) is one of the most common subroutines in numerical linear algebra. The problem is that the memory access patterns during SpM×V are irregular, and cache utilization can suffer from low spatial or temporal locality. Approaches to improving the performance of SpM×V are based on matrix reordering and register blocking. These matrix transformations are designed to handle randomly occurring dense blocks in a sparse matrix, and their efficiency depends strongly on the presence of suitable blocks. The overhead of reorganizing a matrix from one format to another is often on the order of tens of executions of SpM×V. For this reason, such a reorganization pays off only if the same matrix A is multiplied by multiple different vectors, e.g., in iterative linear solvers. This paper introduces an unusual approach to accelerating SpM×V. This approach can be combined with other acceleration approaches and consists of three steps: 1) dividing matrix A into non-empty regions, 2) choosing an efficient way to traverse these regions (in other words, choosing an efficient ordering of partial multiplications), and 3) choosing the optimal type of storage for each region. All three steps are tightly coupled. The first step divides the whole matrix into smaller parts (regions) that can fit in the cache. The second step improves locality during multiplication through better utilization of distant references. The last step maximizes the machine computation performance of the partial multiplication for each region. In this paper, we describe aspects of these three steps in more detail (including fast and time-inexpensive algorithms for all steps). Our measurements prove that our approach gives a significant speedup for almost all matrices arising from various technical areas.
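To illustrate the locality problem the abstract describes, the following is a minimal sketch of SpM×V over the common CSR (Compressed Sparse Row) storage format. CSR is used here only as a representative baseline, not as the paper's proposed format; the function and variable names are illustrative. Note how the accesses to `x` are indirect through `col_idx`, which is the irregular access pattern that causes poor cache locality:

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A * x, where A is stored in CSR format:
    values  - nonzero values, row by row
    col_idx - column index of each nonzero
    row_ptr - row_ptr[i]..row_ptr[i+1] delimit the nonzeros of row i
    """
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # indirect access x[col_idx[k]] is the source of irregular,
            # cache-unfriendly memory references during SpM×V
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[1, 0, 2],
#      [0, 3, 0]]
y = spmv_csr([1.0, 2.0, 3.0], [0, 2, 1], [0, 2, 3], [1.0, 1.0, 1.0])
# y == [3.0, 3.0]
```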

Highlights

  • There are several formats for storing sparse matrices

  • We define the accuracy of cache behavior simulation algorithm (CBSA) as the ratio between the number of cache misses predicted by CBSA and the number of cache misses measured by the SW cache analyzer

  • We have presented an unusual approach to accelerate sparse matrix-vector multiplication
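The CBSA accuracy metric named in the highlights reduces to a simple ratio; a one-line sketch of that definition (function name is illustrative, not from the paper):

```python
def cbsa_accuracy(predicted_misses, measured_misses):
    """Accuracy of the cache behavior simulation algorithm (CBSA):
    the ratio between the number of cache misses predicted by CBSA
    and the number measured by the SW cache analyzer."""
    return predicted_misses / measured_misses

# e.g., 95 predicted vs. 100 measured misses -> accuracy 0.95
```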


Summary

Šimeček

Sparse matrix-vector multiplication (SpM×V for short) is one of the most common subroutines in numerical linear algebra. Approaches to improving the performance of SpM×V are based on matrix reordering and register blocking. These matrix transformations are designed to handle randomly occurring dense blocks in a sparse matrix. This paper introduces an unusual approach to accelerating SpM×V. This approach can be combined with other acceleration approaches and consists of three steps: 1) dividing matrix A into non-empty regions, 2) choosing an efficient way to traverse these regions (in other words, choosing an efficient ordering of partial multiplications), and 3) choosing the optimal type of storage for each region. All three steps are tightly coupled.

Introduction
The cache model
Common sparse matrix formats
Usual approach
Our approach
Dividing the matrix into regions
Choosing a suitable storage format for the regions
Choosing a good traversal of regions
Evaluation of the results
Test data
Accuracy of CBSA
Performance
Speedup
Payoff iterations
Impact of the traversal on the performance
Conclusions