Abstract

Sparse matrix–vector multiplication (SpMV) is a core operation in scientific computing and engineering applications. Its performance can be improved through parallel computing, and implementing and optimizing SpMV on GPUs is an active research topic. Because sparse matrices are structurally irregular, no single compression format performs well across all of them; hybrid storage formats widen the range of matrices that a compression scheme handles well. Even so, the uneven distribution of non-zero elements prevents the parallel computing capability of a GPU from being fully utilized. The parallel computing capability of CPUs is also rising as core counts increase, yet while a GPU is computing, the CPU only controls the process instead of contributing to the computational work, which leaves its computing power under-utilized. A hybrid storage format splits the matrix data into two parts, which can be allocated to the CPU and the GPU for simultaneous computation. To take full advantage of the computing resources of both processors, this paper adopts a CPU–GPU heterogeneous computing model to improve the performance of SpMV. Based on an analysis of the characteristics of the CPU and the GPU, an optimization strategy that partitions the sparse matrix using a distribution function is proposed to improve SpMV performance on the heterogeneous computing platform. Experimental results on two test machines demonstrate a noticeable performance improvement.
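
To illustrate how a hybrid storage format divides SpMV into two independent pieces of work, the following is a minimal sketch rather than the method of this paper. It assumes an ELL-plus-COO hybrid and a fixed per-row width `k` as the split rule; the abstract names neither the formats nor the distribution function, so `k`, `split_hybrid`, `spmv_ell`, `spmv_coo`, and `spmv_hybrid` are all hypothetical names chosen for illustration. Both parts run sequentially on the CPU here; in a heterogeneous setting each partial product would be computed on a different device and the results summed.

```python
def split_hybrid(rows, k):
    """Split a sparse matrix into a regular ELL part and a COO remainder.

    rows: rows[i] is a list of (col, value) pairs for row i.
    k:    hypothetical per-row width of the regular part (an assumption,
          standing in for whatever distribution function is actually used).
    """
    ell_cols, ell_vals, coo = [], [], []
    for i, entries in enumerate(rows):
        head, tail = entries[:k], entries[k:]
        pad = k - len(head)
        # Pad short rows so every ELL row has exactly k slots (-1 marks padding).
        ell_cols.append([c for c, _ in head] + [-1] * pad)
        ell_vals.append([v for _, v in head] + [0.0] * pad)
        # Entries beyond the first k of a row spill into the COO part.
        coo.extend((i, c, v) for c, v in tail)
    return (ell_cols, ell_vals), coo


def spmv_ell(ell, x):
    """Partial product for the regular part; skips padding slots."""
    ell_cols, ell_vals = ell
    y = [0.0] * len(ell_cols)
    for i, (cols, vals) in enumerate(zip(ell_cols, ell_vals)):
        for c, v in zip(cols, vals):
            if c >= 0:
                y[i] += v * x[c]
    return y


def spmv_coo(coo, x, n_rows):
    """Partial product for the irregular remainder."""
    y = [0.0] * n_rows
    for i, c, v in coo:
        y[i] += v * x[c]
    return y


def spmv_hybrid(rows, x, k):
    """y = A x, computed as the sum of the two partial products."""
    ell, coo = split_hybrid(rows, k)
    y_regular = spmv_ell(ell, x)                 # would run on one device
    y_irregular = spmv_coo(coo, x, len(rows))    # would run on the other
    return [a + b for a, b in zip(y_regular, y_irregular)]


rows = [[(0, 1.0), (2, 2.0)], [(1, 3.0)], [(0, 4.0), (1, 5.0), (2, 6.0)]]
x = [1.0, 2.0, 3.0]
print(spmv_hybrid(rows, x, k=2))  # [7.0, 6.0, 32.0]
```

Because the two partial products touch disjoint sets of non-zeros, they have no data dependency until the final addition, which is what makes simultaneous CPU and GPU execution possible. Choosing `k` shifts work between the two parts, so a split rule of this kind is one plausible place for a load-balancing strategy to act.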
