Abstract

Matrix multiplication is a key component of modern deep neural network (DNN) applications. As DNN applications become more diverse, both dense and sparse matrix multiplication need to be accelerated in hardware. However, most hardware accelerators are designed to accelerate either dense or sparse matrix multiplication, but not both. In this paper, we propose VerSA, a versatile systolic array architecture for both dense and sparse matrix multiplication. VerSA adds intermediate paths and SRAM buffers between the rows of the systolic array (SA), enabling early termination in sparse matrix multiplication while incurring negligible overhead on dense matrix multiplication. When running sparse matrix multiplication, a 256 × 256 VerSA improves performance (i.e., the inverse of execution time) by 1.21×–1.60× and reduces energy by 7.5–30.2% compared to the conventional SA. When running dense matrix multiplication, VerSA incurs only a 0.52% performance overhead compared to the conventional SA.
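The benefit VerSA targets, skipping useless multiply-accumulate work when an operand is zero, can be illustrated with a software toy sketch. This is not the VerSA hardware or its dataflow; it is a plain-Python analogue in which a sparsity-aware loop "terminates early" on zero operands, while the baseline loop performs every multiply-accumulate the way a conventional SA would. The function names and the skip-counting are illustrative assumptions, not part of the paper.

```python
import numpy as np

def dense_matmul(a, b):
    """Baseline: perform every multiply-accumulate, like a conventional SA."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    out = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out

def sparsity_aware_matmul(a, b):
    """Toy analogue of early termination: when an A-operand is zero,
    skip the entire row of B contributions it would have produced."""
    m, k = a.shape
    _, n = b.shape
    out = np.zeros((m, n))
    skipped = 0  # count of zero operands whose work was skipped (illustrative)
    for i in range(m):
        for p in range(k):
            if a[i, p] == 0.0:
                skipped += 1
                continue
            for j in range(n):
                out[i, j] += a[i, p] * b[p, j]
    return out, skipped

# A sparse A matrix: half of its entries are zero.
a = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([[3.0, 4.0], [5.0, 6.0]])
ref = dense_matmul(a, b)
res, skipped = sparsity_aware_matmul(a, b)
```

Both paths compute the same product; the sparsity-aware path simply does less work in proportion to the zeros in `a`, which is the source of the speedup and energy saving the abstract reports for sparse workloads.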
