Abstract

Sparse Matrix-Vector Multiplication (SpMV) is important in scientific and industrial applications and remains a well-known challenge for modern CPUs because of the high sparsity and irregularity of the matrices involved. Many researchers try to improve SpMV performance by designing dedicated data formats and computation patterns. However, out-of-order superscalar CPUs have complex micro-architectures with complicated interactions and restrictions among software and hardware factors. It is therefore hard to systematically study the effect of an optimization method on overall performance, as its benefits may be undermined by other factors. In this paper, we thoroughly study the execution of SpMV on modern CPUs and propose a comprehensive performance model that reveals the critical factors and their relationships. Specifically, we first study the coding characteristics of SpMV kernels to identify the key factors worthy of attention. We then model the execution of SpMV as two overlapped parts: the CPU pipeline and memory latency. Both are carefully modeled with the related hardware and software factors. We also model SIMD performance in terms of the usage of specific SIMD instructions and vector registers. Experiments show that our model matches the actual execution on real-world processors. Guided by the model, we propose SpV8, a novel SpMV kernel that optimizes the critical factors to improve computation efficiency and memory bandwidth. Experiments on Intel/AMD x86 and ARM AArch64 platforms show that SpV8 outperforms several state-of-the-art approaches by large margins, achieving average speedups of $3.4\times$ over the Intel Math Kernel Library and $1.4\times$ over the best existing approach. These results indicate that the proposed model can provide valuable guidance for efficient SpMV optimization.
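For readers unfamiliar with the irregularity mentioned above, the following is a minimal sketch of a textbook SpMV kernel over the common compressed sparse row (CSR) layout. It is not SpV8 and is not taken from the paper; the function name `spmv_csr` and the array names `row_ptr`, `col_idx`, and `values` are illustrative assumptions. The indirect load `x[col_idx[k]]` is the data-dependent access that makes memory behavior hard to predict, and the variable-length inner loop is what complicates pipelining and SIMD use.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    (Illustrative names; not the paper's SpV8 kernel.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Row length varies from row to row, so the trip count is irregular.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect, data-dependent read of x: the irregular access.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Small example: A = [[1, 0, 2],
#                     [0, 3, 0],
#                     [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
y = spmv_csr(row_ptr, col_idx, values, x)
```

Each nonzero contributes one multiply-add but requires loading a value, a column index, and an element of `x`, which is why both the pipeline and the memory system matter for overall performance.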
