Abstract

Multiplying a sparse matrix with a vector, denoted spmv, is a fundamental operation in linear algebra with several applications. Hence, efficient and scalable implementation of spmv has been a topic of immense research. Recent efforts are aimed at implementations on GPUs, multicore architectures, and such emerging computational platforms. Owing to the highly irregular nature of spmv, it is observed that GPUs and CPUs can offer comparable performance.In this paper, we propose three heterogeneous algorithms for spmv that simultaneously utilize both the CPU and the GPU. This is shown to lead to better resource utilization apart from performance gains. Our experiments of the work division schemes on standard datasets indicate that it is not in general possible to choose the most appropriate scheme given a matrix. We therefore consider a class of sparse matrices that exhibit a scale-free nature and identify a scheme that works well for such matrices. Finally, we use simple and effective mechanisms to determine the appropriate amount of work to be alloted to the CPU and the GPU.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call