Abstract

Resistive random-access memory (ReRAM) crossbars are a promising substrate for deep neural network (DNN) accelerators, thanks to their in-memory and in-situ analog computing abilities for vector–matrix multiply-and-accumulate operations (VMMs). However, it is challenging for the crossbar architecture to exploit the sparsity in DNNs: the tightly coupled crossbar structure makes exploiting fine-grained sparsity inevitably complex and costly. As a countermeasure, we develop a novel ReRAM-based DNN accelerator, named the sparse-multiplication-engine (SME), based on a hardware/software co-design framework. First, we orchestrate the bit-sparse pattern to increase the density of bit-sparsity on top of existing quantization methods. Such quantized weights can be generated naturally with alternating direction method of multipliers (ADMM) optimization during DNN fine-tuning, which exactly enforces the desired bit patterns in the weights. Second, we propose a novel weight mapping mechanism that slices the bits of each weight across crossbars and splices the activation results in the peripheral circuits. This mechanism decouples the tightly coupled crossbar structure and concentrates the sparsity within crossbars. Finally, a squeeze-out scheme removes crossbars that are left with only highly sparse nonzeros after the previous two steps. We design the SME architecture and discuss its use with other quantization methods and different ReRAM cell technologies. We further propose a workload-grouping algorithm and a pipeline that balance the workload among crossbar rows concurrently executing multiply–accumulate operations, optimizing system latency. Putting it all together, with the optimized model the SME reduces crossbar usage by up to $8.7\times$ and $2.1\times$ for ResNet-50 and MobileNet-v2, respectively, and achieves an average $3.1\times$ speedup with little or no accuracy loss on ImageNet.
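
To make the slice-and-splice mapping concrete, the following Python sketch shows the general bit-slicing idea in digital emulation: weights are split into per-bit planes (one plane per crossbar), each plane computes its own VMM, and the partial sums are recombined by shift-and-add as peripheral circuits would. This is an illustrative sketch only, not the paper's implementation; the 8-bit unsigned quantization, integer activations, and function names are assumptions made for this example.

```python
import numpy as np

def slice_weight_bits(weights, n_bits=8):
    """Split an unsigned-integer weight matrix into per-bit 0/1 planes,
    one plane per crossbar (bit 0 is the LSB)."""
    return [((weights >> b) & 1) for b in range(n_bits)]

def crossbar_vmm(bit_plane, activations):
    """One crossbar's analog VMM over a single bit plane, emulated digitally."""
    return activations @ bit_plane

def splice(partial_sums):
    """Peripheral shift-and-add: bit plane b carries significance 2**b."""
    return sum(ps * (1 << b) for b, ps in enumerate(partial_sums))

# Splicing the per-plane results reproduces the full-precision VMM.
rng = np.random.default_rng(0)
W = rng.integers(0, 256, size=(16, 4))   # 8-bit unsigned weight matrix
x = rng.integers(0, 16, size=(1, 16))    # integer input activations
out = splice([crossbar_vmm(p, x) for p in slice_weight_bits(W)])
assert np.array_equal(out, x @ W)
```

Because each bit plane is mapped independently, planes that end up almost entirely zero (after the bit-pattern quantization step) correspond to crossbars that the squeeze-out scheme can eliminate.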
