Straight‐line programs for fast sparse matrix‐vector multiplication

Samuel Neves,Filipe Araujo

doi:10.1002/cpe.3211

Samuel Neves, Filipe Araujo

https://doi.org/10.1002/cpe.3211

Copy DOI

Export

Save

Cite

Abstract
Full-Text
Similar Papers

Abstract

Listen

SummarySparse matrix‐vector multiplication dominates the performance of many scientific and industrial problems. For example, iterative methods for solving linear systems often rely on the performance of this critical operation. The particular case of binary matrices shows up in several important areas of computing, such as graph theory and cryptography. Unfortunately, irregular memory access patterns cause poor memory throughput, slowing down this operation. To maximize memory throughput, we translate the matrix into a straight‐line program that takes advantage of the CPU's instruction cache and hardware prefetchers. The regular loopless pattern of the program reduces cache misses, thus decreasing the latency for most instructions. We focus on the widely used x86_64 architecture and on binary matrices, to explore several possible tradeoffs regarding memory access policies and code size. We also consider matrices with elements over various mathematical structures, such as floating‐point reals and integers modulo m. When compared to a Compressed Row Storage implementation, we obtain significant speedups. Copyright © 2014 John Wiley & Sons, Ltd.

Full Text