Architecture- and workload-aware heterogeneous algorithms for sparse matrix vector multiplication

Sivaramakrishna Bharadwaj Indarapu, Manoj Maramreddy, Kishore Kothapalli

doi:10.1145/2675744.2675749


https://doi.org/10.1145/2675744.2675749


Abstract

Multiplying a sparse matrix with a vector, denoted spmv, is a fundamental operation in linear algebra with several applications. Hence, efficient and scalable implementation of spmv has been a topic of immense research. Recent efforts are aimed at implementations on GPUs, multicore architectures, and other such emerging computational platforms. Owing to the highly irregular nature of spmv, it is observed that GPUs and CPUs can offer comparable performance. In this paper, we propose three heterogeneous algorithms for spmv that simultaneously utilize both the CPU and the GPU. This is shown to lead to better resource utilization in addition to performance gains. Our experiments with the work division schemes on standard datasets indicate that it is not, in general, possible to choose the most appropriate scheme for a given matrix. We therefore consider a class of sparse matrices that exhibit a scale-free nature and identify a scheme that works well for such matrices. Finally, we use simple and effective mechanisms to determine the appropriate amount of work to be allotted to the CPU and the GPU.
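
To make the idea of dividing spmv work between two devices concrete, the following is a minimal sketch and not one of the three algorithms proposed in the paper. It assumes the matrix is stored in CSR form and splits the rows into two contiguous blocks so that the first block holds roughly a chosen fraction of the nonzeros. The names `spmv_csr_rows`, `split_row_by_nnz`, `heterogeneous_spmv`, and the parameter `gpu_fraction` are hypothetical, and both blocks are computed by the same plain Python routine standing in for the GPU and CPU kernels.

```python
from bisect import bisect_left


def spmv_csr_rows(row_ptr, col_idx, vals, x, row_start, row_end):
    """Compute rows [row_start, row_end) of y = A x, with A in CSR form."""
    y = []
    for r in range(row_start, row_end):
        acc = 0.0
        # Nonzeros of row r occupy positions row_ptr[r] .. row_ptr[r + 1] - 1.
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += vals[k] * x[col_idx[k]]
        y.append(acc)
    return y


def split_row_by_nnz(row_ptr, gpu_fraction):
    """Smallest row index s such that rows [0, s) hold at least
    gpu_fraction of all nonzeros (hypothetical work-division rule)."""
    target = gpu_fraction * row_ptr[-1]
    # row_ptr is nondecreasing, so a binary search finds the split point.
    return bisect_left(row_ptr, target)


def heterogeneous_spmv(row_ptr, col_idx, vals, x, gpu_fraction=0.5):
    """Row-partitioned spmv: rows [0, split) stand in for the GPU share,
    rows [split, n) for the CPU share. In a real system the two calls
    below would run concurrently on different devices."""
    n = len(row_ptr) - 1
    split = split_row_by_nnz(row_ptr, gpu_fraction)
    y_gpu = spmv_csr_rows(row_ptr, col_idx, vals, x, 0, split)
    y_cpu = spmv_csr_rows(row_ptr, col_idx, vals, x, split, n)
    return y_gpu + y_cpu
```

Splitting by nonzero count rather than by row count is one simple way to balance load when row lengths vary widely, as they do in scale-free matrices; the right value of `gpu_fraction` would depend on the relative throughput of the two devices.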
