Abstract

SpMV is the core algorithm in solving the sparse linear equations, which is widely used in many research and engineering application field. GPU is the most common coprocessor in high-performance computing domain, and has already been proven to researchers the practical value in accelerating various algorithms. A lot of reletead work has been carried out to optimize parallel SpMV on CPU-GPU platforms, which mainly focuses on reducing the computing overhead on the GPU, including branch divergence and cache missing, and little attention was paid to the overall efficiency of the heterogeneous platform. In this paper, we describe the design and implementation of an adaptive sparse matrix-vector multiplication (SpMV) on CPU-GPU heterogeneous architecture. We propose a dynamic task scheduling framework for CPU-GPU platform to improve the utilization of both CPU and GPU. A double buffering scheme is also presented to hide the data transfer overhead between CPU and GPU. Two deeply optimized SpMV kernels are deployed for CPU and GPU respectively. The evaluation on typical sparse matrices indicates that the proposed algorithm obtains both significant performance increase and adaptability to different types of sparse matrices.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call