SparCE: Sparsity Aware General-Purpose Core Extensions to Accelerate Deep Neural Networks

Sanchari Sen, Anand Raghunathan, Swagath Venkataramani, Shubham Jain

doi:10.1109/tc.2018.2879434

Abstract

Deep Neural Networks (DNNs) have emerged as the method of choice for solving a wide range of machine learning tasks. The enormous computational demand posed by DNNs is a key challenge for computing system designers and has most commonly been addressed through the design of DNN accelerators. However, these specialized accelerators utilize large quantities of multiply-accumulate units and on-chip memory and are prohibitive in area- and cost-constrained systems such as wearable devices and IoT sensors. In this work, we take a complementary approach and improve the performance of DNNs on general-purpose processor (GPP) cores. We do so by exploiting a key attribute of DNNs, viz., sparsity, or the prevalence of zero values. We propose Sparsity-aware Core Extensions (SparCE), a set of low-overhead micro-architectural and ISA extensions that dynamically detect whether an operand (e.g., the result of a load instruction) is zero and subsequently skip a set of future instructions that use it. To maximize performance benefits, SparCE ensures that the instructions to be skipped are prevented from even being fetched, as squashing instructions comes with a penalty (e.g., a pipeline stall). SparCE consists of two key micro-architectural enhancements. First, a Sparsity Register File (SpRF) is utilized to track registers that are zero. Next, a Sparsity-Aware Skip Address (SASA) Table is used to indicate instruction sequences that can be skipped, and to specify conditions on SpRF registers that trigger instruction skipping. When an instruction is fetched, SparCE dynamically pre-identifies whether the following instruction(s) can be skipped, and if so appropriately modifies the program counter, thereby skipping the redundant instructions and improving performance. We model SparCE using the gem5 architectural simulator, and evaluate our approach on six state-of-the-art image-recognition DNNs in the context of both training and inference using the Caffe deep learning framework. On a scalar microprocessor, SparCE achieves 1.11×-1.96× speedups across both convolution and fully-connected layers that exhibit 10-90 percent sparsity. These speedups translate to a 19-31 percent reduction in execution time at the overall application level. We also evaluate SparCE on a 4-way SIMD ARMv8 processor using the OpenBLAS library, and demonstrate that SparCE achieves an 8-15 percent reduction in application-level execution time.
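The skipping mechanism described in the abstract can be illustrated with a toy software model. The sketch below is not the authors' gem5 implementation: the four-instruction loop body, the register names, the `SasaEntry` fields, and the `run` helper are all assumptions made for illustration, and pipeline timing is ignored entirely. It only shows how a per-register zero bit (standing in for the SpRF) and a table keyed by program counter (standing in for the SASA Table) could redirect the program counter past a multiply-accumulate sequence when an activation is zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SasaEntry:
    """Hypothetical SASA table row (field names are assumptions)."""
    reg: str   # register whose SpRF bit is checked
    skip: int  # number of following instructions to skip if that register is zero


# Toy loop body for acc += a[i] * w[i]; one tuple per instruction.
BODY = (
    ("load_a", "r1"),
    ("load_w", "r2"),
    ("mul", "r3", "r1", "r2"),
    ("add", "acc", "acc", "r3"),
)

# After the instruction at pc 0 (the activation load), skip the next three
# instructions whenever r1 holds zero.
SASA = {0: SasaEntry(reg="r1", skip=3)}


def run(activations, weights, sasa=None):
    """Return (dot product, instructions executed) for the toy core."""
    sasa = sasa or {}
    regs = {"r1": 0.0, "r2": 0.0, "r3": 0.0, "acc": 0.0}
    sprf = {name: True for name in regs}  # SpRF stand-in: one is-zero bit per register
    executed = 0

    def write(reg, value):
        regs[reg] = value
        sprf[reg] = (value == 0)  # zero detection when a register is written

    for a, w in zip(activations, weights):
        pc = 0
        while pc < len(BODY):
            op, dst, *src = BODY[pc]
            if op == "load_a":
                write(dst, a)
            elif op == "load_w":
                write(dst, w)
            elif op == "mul":
                write(dst, regs[src[0]] * regs[src[1]])
            elif op == "add":
                write(dst, regs[src[0]] + regs[src[1]])
            executed += 1
            entry = sasa.get(pc)
            # Redirect the program counter past the redundant instructions,
            # so the model never fetches them.
            if entry is not None and sprf[entry.reg]:
                pc += entry.skip
            pc += 1
    return regs["acc"], executed
```

With activations `[0.0, 2.0, 0.0, 3.0]` and weights `[5.0, 1.5, 7.0, 2.0]`, both `run(a, w)` and `run(a, w, SASA)` return a dot product of `9.0`, but the baseline executes 16 toy instructions while the SASA-enabled run executes 10, because each zero activation costs only its load. These counts describe this toy model only and say nothing about the speedups reported above.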

Journal: IEEE Transactions on Computers
Publication Date: Jun 1, 2019
Citations: 48

Similar Papers

Sparsity-Aware Caches to Accelerate Deep Neural Networks
Vinod Ganesan ... Neel Gala
01 Mar 2020

Efficient and Robust Deep Learning through Approximate Computing
28 Jul 2020

Performance analysis and optimization on the UCLA parallel atmospheric general circulation model code
John Lou ... John Farrara
17 Nov 1996

Performance analysis and optimization on a parallel atmospheric general circulation model code
John Z Lou ... John D Farrara
Concurrency: Practice and Experience | VOL. 10
01 Jun 1998
