Abstract
Recent developments in deep neural network (DNN) pruning introduce data sparsity that enables deep learning applications to run more efficiently on resource- and energy-constrained hardware platforms. However, these sparse models require specialized hardware structures to fully exploit the sparsity for storage, latency, and efficiency improvements. In this work, we present the sparse neural acceleration processor (SNAP) to exploit unstructured sparsity in DNNs. SNAP uses parallel associative search to discover valid weight (W) and input activation (IA) pairs from compressed, unstructured, sparse W and IA data arrays. The associative search allows SNAP to maintain an average compute utilization of 75%. SNAP follows a channel-first dataflow and uses a two-level partial sum (psum) reduction dataflow to eliminate access contention at the output buffer and cut the psum writeback traffic by 22× compared with state-of-the-art DNN accelerator designs. SNAP's psum reduction dataflow can be configured in two modes to support general convolution (CONV) layers, pointwise CONV layers, and fully connected layers. A prototype SNAP chip is implemented in a 16-nm CMOS technology. The 2.3-mm² test chip achieves a measured peak effectual efficiency of 21.55 TOPS/W (16 b) at 0.55 V and 260 MHz for CONV layers with 10% weight and activation densities. Running a pruned ResNet-50 network, the test chip achieves a peak throughput of 90.98 frames/s at 0.80 V and 480 MHz, dissipating 348 mW.
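To make the associative-search idea concrete, the following is a minimal software sketch, assuming a compressed format that stores each nonzero value alongside its input-channel index. The function name and data layout are illustrative assumptions for exposition, not SNAP's actual hardware logic, which performs this matching in parallel with dedicated comparator circuits.

```python
# Illustrative sketch (assumed data layout, not the chip's implementation):
# matching nonzero weight/activation pairs by input-channel index, the
# software analogue of SNAP's parallel associative search.

def associative_match(w_vals, w_idx, ia_vals, ia_idx):
    """Return products of W/IA pairs whose channel indices match.

    w_vals, ia_vals: nonzero values from the compressed W and IA arrays.
    w_idx,  ia_idx:  the input-channel index of each nonzero value.
    """
    # Index the activations by channel so each weight can look up its
    # partner, mimicking a content-addressable (associative) search.
    ia_by_channel = {c: v for c, v in zip(ia_idx, ia_vals)}
    products = []
    for w, c in zip(w_vals, w_idx):
        if c in ia_by_channel:  # only "valid" pairs contribute work
            products.append(w * ia_by_channel[c])
    return products

# Example: sparse vectors over 16 channels; only channel 3 matches,
# so a dense multiply-accumulate over all 16 channels is avoided.
w_vals, w_idx = [2, -1], [3, 7]
ia_vals, ia_idx = [5, 4], [3, 12]
psum = sum(associative_match(w_vals, w_idx, ia_vals, ia_idx))  # 2*5 = 10
```

Because only index-matched pairs are dispatched to the multipliers, compute cycles are spent almost exclusively on effectual (nonzero × nonzero) operations, which is what sustains the high utilization reported above.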