Simultaneous Input and Output Matrix Partitioning for Outer-Product--Parallel Sparse Matrix-Matrix Multiplication

Kadir Akbudak, Cevdet Aykanat

doi:10.1137/13092589x

Abstract

For outer-product--parallel sparse matrix-matrix multiplication (SpGEMM) of the form $C = A \times B$, we propose three hypergraph models that achieve simultaneous partitioning of input and output matrices without any replication of input data. All three hypergraph models perform conformable one-dimensional (1D) columnwise and 1D rowwise partitioning of the input matrices $A$ and $B$, respectively. The first hypergraph model performs two-dimensional (2D) nonzero-based partitioning of the output matrix, whereas the second and third models perform 1D rowwise and 1D columnwise partitioning of the output matrix, respectively. This partitioning scheme induces a two-phase parallel SpGEMM algorithm, where communication-free local SpGEMM computations constitute the first phase and multiple single-node-accumulation operations on the local SpGEMM results constitute the second phase. In these models, the two partitioning constraints defined on the weights of vertices encode balancing the computational loads of processors during the two separate phases of the parallel SpGEMM algorithm. The partitioning objective of minimizing the cutsize defined over the cut nets encodes minimizing the total volume of communication that will occur during the second phase of the parallel SpGEMM algorithm. An MPI-based parallel SpGEMM library is developed to verify the validity of our models in practice. Parallel runs of the library for a wide range of realistic SpGEMM instances on two large-scale parallel systems, JUQUEEN (an IBM BlueGene/Q system) and SuperMUC (an Intel-based cluster), show that the proposed hypergraph models attain high speedup values.
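
The two-phase scheme described in the abstract can be illustrated with a small serial simulation. This is a minimal sketch, not the authors' MPI library: the function name `outer_product_spgemm`, the dictionary-based sparse storage, the hand-picked partition `part`, and the `owner_of` mapping are all assumptions made for illustration. In particular, the partition here is fixed by hand rather than produced by a hypergraph partitioner, and communication volume is counted as one word per partial result sent to a processor other than its owner.

```python
from collections import defaultdict

def outer_product_spgemm(A_cols, B_rows, part, owner_of, K):
    """Simulate outer-product-parallel C = A x B on K processors.

    A_cols[k] : dict {row i: a_ik}, the nonzeros of column k of A
    B_rows[k] : dict {col j: b_kj}, the nonzeros of row k of B
    part[k]   : processor holding BOTH column k of A and row k of B
                (the conformable 1D columnwise / 1D rowwise partition)
    owner_of  : function (i, j) -> processor that owns output entry c_ij
    Returns (C, volume): the product as a dict and the number of
    partial results that had to be sent to another processor.
    """
    # Phase 1: communication-free local outer products.
    # Each processor multiplies only the column/row pairs it holds.
    local = [defaultdict(float) for _ in range(K)]
    for k, p in enumerate(part):
        for i, a in A_cols[k].items():
            for j, b in B_rows[k].items():
                local[p][(i, j)] += a * b

    # Phase 2: accumulate partial results at the owner of each c_ij.
    C = defaultdict(float)
    volume = 0
    for p in range(K):
        for (i, j), value in local[p].items():
            if owner_of(i, j) != p:
                volume += 1  # this partial result crosses processors
            C[(i, j)] += value
    return dict(C), volume

# A = [[1, 0], [2, 3]] stored by columns, B = [[4, 5], [0, 6]] stored by rows.
A_cols = [{0: 1, 1: 2}, {1: 3}]
B_rows = [{0: 4, 1: 5}, {1: 6}]
part = [0, 1]                      # hypothetical partition: one column/row pair each
rowwise_owner = lambda i, j: i % 2  # 1D rowwise partition of the output (assumed)

C, volume = outer_product_spgemm(A_cols, B_rows, part, rowwise_owner, K=2)
print(C)       # product entries keyed by (row, column)
print(volume)  # partial results sent between processors
```

Changing `part` or `owner_of` changes `volume` without changing `C`, which is the quantity the partitioning objective in the abstract aims to reduce.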


Journal: SIAM Journal on Scientific Computing (a publication of the Society for Industrial and Applied Mathematics)
Publication Date: Jan 1, 2014
Citations: 31

Similar Papers

A method of rule match conflict resolution for product configuration in manufacturing
Niya Li ... Yuhua Chen
International Journal of Computer Integrated Manufacturing | VOL. 22
01 Mar 2009

Solving of the Static Output Feedback Synthesis Problem in a Class of Block-Homogeneous Matrices of Input and Output
A V Mukhin
01 Jan 2021

High-Performance and Memory-Saving Sparse General Matrix-Matrix Multiplication for NVIDIA Pascal GPU
Yusuke Nagasaka ... Satoshi Matsuoka
01 Aug 2017

Improved robust gain-scheduling static output-feedback control for discrete-time LPV systems
Márcia L.C Peixoto ... Reinaldo M Palhares
European Journal of Control | VOL. 58
06 Jan 2021
