Abstract

The exploitation of spatial and temporal locality is investigated for efficient row-by-row parallelization of the general sparse matrix-matrix multiplication (SpGEMM) operation of the form $C=A\,B$ on many-core architectures. Hypergraph and bipartite graph models are proposed for 1D rowwise partitioning of matrix $A$ to evenly partition the work across threads, with the objective of reducing the number of $B$-matrix words to be transferred from memory and between different caches. A hypergraph model is proposed for $B$-matrix column reordering to exploit spatial locality in accessing entries of the thread-private temporary arrays that are used to accumulate results for $C$-matrix rows. A similarity graph model is proposed for $B$-matrix row reordering to increase temporal reuse of these accumulation array entries. The proposed models and methods are tested on a wide range of sparse matrices from real applications, and the experiments were carried out on a 60-core Intel Xeon Phi processor as well as a two-socket Xeon processor. The results show the validity of the proposed models and methods for enhancing locality in parallel SpGEMM operations.
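
For readers unfamiliar with the row-by-row formulation, the following is a minimal sequential sketch of how one row of $C$ can be accumulated in a dense temporary array. It is an illustration only, not the authors' implementation: the CSR layout, the function name `spgemm_rowwise`, and the `marker` array used to reset accumulator entries are all assumptions made for this example. In a parallel run, each thread would presumably own its own `acc` array and process its assigned rows of $A$.

```python
def spgemm_rowwise(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, n_cols_b):
    """Compute C = A * B row by row; A, B and C are in CSR form (assumed layout)."""
    acc = [0.0] * n_cols_b      # dense accumulation array (thread-private in a parallel run)
    marker = [-1] * n_cols_b    # marker[j] == i means acc[j] is valid for row i
    c_ptr, c_idx, c_val = [0], [], []

    for i in range(len(a_ptr) - 1):
        touched = []            # column indices that become nonzero in row i of C
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]        # nonzero a_ik selects row k of B
            a_ik = a_val[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                if marker[j] != i:
                    # first access to column j for this row: reset the entry
                    marker[j] = i
                    acc[j] = 0.0
                    touched.append(j)
                acc[j] += a_ik * b_val[q]
        touched.sort()          # emit row i of C with sorted column indices
        for j in touched:
            c_idx.append(j)
            c_val.append(acc[j])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

In this sketch, the order in which rows of $B$ are visited determines how soon an entry `acc[j]` is reused, and the column indices `j` determine which parts of `acc` are accessed together. These are the two access patterns that the $B$-matrix row and column reorderings described above target.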

