Abstract

The symmetric sparse matrix-vector multiplication (SymmSpMV) is an important building block for many numerical linear algebra kernel operations or graph traversal applications. Parallelizing SymmSpMV on today’s multicore platforms with up to 100 cores is difficult due to the need to manage conflicting updates on the result vector. Coloring approaches can be used to solve this problem without data duplication, but existing coloring algorithms do not take load balancing and deep memory hierarchies into account, hampering scalability and full-chip performance. In this work, we propose the recursive algebraic coloring engine (RACE), a novel coloring algorithm and open-source library implementation that eliminates the shortcomings of previous coloring methods in terms of hardware efficiency and parallelization overhead. We describe the level construction, distance-k coloring, and load balancing steps in RACE, use it to parallelize SymmSpMV, and compare its performance on 31 sparse matrices with other state-of-the-art coloring techniques and Intel MKL on two modern multicore processors. RACE outperforms all other approaches substantially. By means of a parameterized roofline model, we analyze the SymmSpMV performance in detail and discuss outliers. While we focus on SymmSpMV in this article, our algorithm and software are applicable to any sparse matrix operation with data dependencies that can be resolved by distance-k coloring.

Highlights

  • The efficient solution of linear systems or eigenvalue problems involving large sparse matrices has been an active research field in parallel and high-performance computing for many decades

  • While we focus on SymmSpMV in this article, our algorithm and software are applicable to any sparse matrix operation with data dependencies that can be resolved by distance-k coloring

  • Before we evaluate the performance across the full set of matrices presented in Table 2, we return to the analysis of the SymmSpMV performance and data traffic for the Spin-26 matrix that we have presented in Section 3.3 for the established coloring approaches


Summary

Introduction

The efficient solution of linear systems or eigenvalue problems involving large sparse matrices has been an active research field in parallel and high-performance computing for many decades. The solvers are typically based on iterative subspace methods and may include advanced preconditioning techniques. Two components, sparse matrix-vector multiplication (SpMV) and coloring techniques, are crucial for hardware efficiency and parallel scalability. These two components are considered to be orthogonal, i.e., hardware efficiency for SpMV is mainly related to data formats and local structures, while coloring is used to address dependencies in the enclosing iteration scheme. The hardware-efficient parallelization of symmetric SpMV requires handling both of these aspects efficiently.
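The coloring idea referred to above can be illustrated with a simple greedy distance-k coloring: vertices within graph distance k of each other get different colors, so all vertices of one color can be updated concurrently without conflicts. This is a hypothetical baseline sketch for illustration only; RACE itself additionally performs recursive level construction and load balancing, which this greedy scheme does not.

```python
def greedy_distance_k_coloring(adj, k):
    """Greedy distance-k coloring of an undirected graph given as adjacency lists.

    Any two vertices within graph distance k receive different colors, so each
    color class can be processed in parallel without write conflicts.
    Simple baseline for illustration, not the RACE algorithm.
    """
    n = len(adj)
    colors = [-1] * n
    for v in range(n):
        # Breadth-first search up to depth k to collect colors already
        # assigned in v's distance-k neighborhood.
        seen = {v}
        frontier = [v]
        forbidden = set()
        for _ in range(k):
            nxt = []
            for u in frontier:
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        nxt.append(w)
                        if colors[w] >= 0:
                            forbidden.add(colors[w])
            frontier = nxt
        # Assign the smallest color not used within distance k.
        c = 0
        while c in forbidden:
            c += 1
        colors[v] = c
    return colors


# Example: path graph 0-1-2-3 with k = 2. Vertices 0 and 3 are at
# distance 3, so they may share a color; all closer pairs must differ.
print(greedy_distance_k_coloring([[1], [0, 2], [1, 3], [2]], 2))  # [0, 1, 2, 0]
```

For SymmSpMV, a distance-2 coloring of the rows of the (symmetrized) sparsity graph suffices: two rows processed concurrently then never update the same entry of the result vector.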

