Abstract

Sparse matrix-vector multiplication (SpMV) is an important kernel in numerical linear algebra and is widely used in many large-scale applications. Accelerating SpMV on multicore and manycore architectures through row-wise parallelization of the Compressed Sparse Row (CSR) format is one of the most popular approaches. However, parallel CSR-based SpMV faces three main challenges: (a) the limited local memory of each computing unit can be overwhelmed when it is assigned long rows of a large-scale sparse matrix; (b) irregular accesses to the input vector incur high memory access latency; and (c) the sparse data structure leads to low bandwidth utilization. This paper proposes a two-phase large-scale SpMV, called tpSpMV, tailored to the memory hierarchy and computing architecture of multicore and manycore platforms to alleviate these three difficulties. First, we propose a two-phase parallel execution technique that splits parallel CSR-based SpMV into two separate phases to overcome the computational scale limitation. Second, we propose adaptive partitioning methods and parallelization designs for each phase, using local-memory caching to exploit the architectural advantages of high-performance computing platforms and reduce memory access latency. Third, we design several optimizations, including data reduction, aligned memory accesses, and pipelining, to improve bandwidth utilization and further tune tpSpMV’s performance. Experimental results on the SW26010 CPUs of the Sunway TaihuLight supercomputer show that tpSpMV achieves speedups of up to 28.61× and outperforms the state-of-the-art by 13.16% on average.
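To make the baseline concrete, the following is a minimal sketch of the conventional row-wise parallel CSR SpMV (y = A·x) that the abstract refers to; it is illustrative only, and the `csr_t` structure and function names are assumptions, not the paper’s code. It exhibits the first two challenges directly: a single long row is processed entirely by one computing unit, and the `x[col_idx[j]]` accesses are irregular.

```c
#include <omp.h>

typedef struct {
    int     n_rows;   /* number of rows                          */
    int    *row_ptr;  /* size n_rows+1: start of each row in val */
    int    *col_idx;  /* column index of each nonzero            */
    double *val;      /* value of each nonzero                   */
} csr_t;

/* Baseline row-wise parallel CSR SpMV: y = A * x. */
void csr_spmv(const csr_t *A, const double *x, double *y)
{
    /* Each thread owns whole rows; one very long row can overflow a
     * core's local memory (challenge (a)), and the indirect accesses
     * x[col_idx[j]] are irregular (challenge (b)). */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < A->n_rows; ++i) {
        double sum = 0.0;
        for (int j = A->row_ptr[i]; j < A->row_ptr[i + 1]; ++j)
            sum += A->val[j] * x[A->col_idx[j]];
        y[i] = sum;
    }
}
```

The sketch below shows one plausible way such a computation can be split into two phases, assuming phase one computes per-nonzero partial products (so work can be partitioned by nonzeros rather than by possibly long rows) and phase two accumulates them per row. This is a hedged illustration of the general idea, not tpSpMV itself; the paper’s adaptive partitioning, local-memory caching, and pipelining on the SW26010 are considerably more elaborate.

```c
/* Illustrative two-phase split (assumed decomposition, not the paper's
 * code). tmp must hold one double per nonzero. */
void two_phase_spmv(const csr_t *A, const double *x, double *y, double *tmp)
{
    int nnz = A->row_ptr[A->n_rows];

    /* Phase 1: nonzero-balanced multiplication into a partial-product
     * buffer, sidestepping the per-row scale limitation. */
    #pragma omp parallel for schedule(static)
    for (int j = 0; j < nnz; ++j)
        tmp[j] = A->val[j] * x[A->col_idx[j]];

    /* Phase 2: per-row accumulation of the partial products; accesses
     * to tmp are now contiguous per row. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < A->n_rows; ++i) {
        double sum = 0.0;
        for (int j = A->row_ptr[i]; j < A->row_ptr[i + 1]; ++j)
            sum += tmp[j];
        y[i] = sum;
    }
}
```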
