SuperSlash: A Unified Design Space Exploration and Model Compression Methodology for Design of Deep Learning Accelerators With Reduced Off-Chip Memory Access Volume

Hazoor Ahmad, Muhammad Abdullah Hanif, Tabasher Arif, Muhammad Shafique, Rehan Hafiz

doi:10.1109/tcad.2020.3012865

Abstract

Deploying deep learning (DL) models on resource-constrained embedded devices is challenging. The limited on-chip memory of such devices increases off-chip memory access volume, which limits the size of the DL models that can be realized efficiently on them. Design space exploration (DSE) under memory constraints, or to minimize off-chip memory access volume, has recently received much attention. Unfortunately, DSE alone cannot reduce off-chip memory accesses beyond a certain point because the model size is fixed. Model compression via pruning can reduce the size of the model and the associated off-chip memory accesses. However, in this article, we demonstrate that pruned models with the same accuracy and model size may still require different numbers of off-chip memory accesses, depending on the pruning strategy adopted. Mainstream pruning techniques are therefore not necessarily tied to the design goals and are hard to integrate with existing DSE techniques. To overcome this problem, we propose SuperSlash, a unified solution for DSE and model compression. SuperSlash estimates the off-chip memory access volume overhead of each layer of a DL model by exploring multiple design candidates. In particular, it evaluates multiple data reuse strategies for each layer, along with the possibility of layer fusion. Layer fusion reduces off-chip memory access volume by avoiding intermediate off-chip storage of a layer's output and using it directly to process the subsequent layer. SuperSlash then guides the pruning process via a ranking function, which ranks each layer according to its explored off-chip memory access cost. We demonstrate that SuperSlash not only offers extensive design space coverage but also provides lower off-chip memory access volume (up to 57.71%, 25.83%, 47.73%, and 29.02% reduction for VGG16, ResNet56, ResNet110, and MobileNetV1, respectively) than the state of the art.
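
The ideas in the abstract (per-layer cost estimation over several data reuse strategies, layer fusion, and a cost-based layer ranking) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the cost model or ranking function from the article: the `Layer` fields, the two toy reuse strategies, the rule that fusion is possible only when the intermediate output fits in the buffer, and the example sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    ifmap: int    # input activation elements
    weights: int  # weight elements
    ofmap: int    # output activation elements


def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def layer_cost(layer: Layer, buffer: int) -> int:
    """Cheapest off-chip access count over two toy data reuse strategies."""
    # Weight reuse: weights fetched once, inputs re-read for every weight tile.
    weight_reuse = layer.weights + layer.ifmap * ceil_div(layer.weights, buffer)
    # Input reuse: inputs fetched once, weights re-read for every input tile.
    input_reuse = layer.ifmap + layer.weights * ceil_div(layer.ifmap, buffer)
    # Outputs are written off chip once in both cases.
    return min(weight_reuse, input_reuse) + layer.ofmap


def fused_cost(first: Layer, second: Layer, buffer: int):
    """Toy cost of fusing two layers, or None if the intermediate does not fit."""
    if first.ofmap > buffer:
        return None
    # The intermediate stays on chip: `first` does not write it, `second` does not read it.
    return first.ifmap + first.weights + second.weights + second.ofmap


def rank_layers(layers, buffer):
    """Layers sorted from highest to lowest estimated off-chip cost."""
    costs = [(layer.name, layer_cost(layer, buffer)) for layer in layers]
    return sorted(costs, key=lambda item: item[1], reverse=True)


# Hypothetical layer sizes, chosen only to exercise the functions above.
layers = [
    Layer("conv1", ifmap=1000, weights=200, ofmap=800),
    Layer("conv2", ifmap=800, weights=4000, ofmap=400),
    Layer("fc", ifmap=400, weights=10000, ofmap=10),
]
BUFFER = 1000  # hypothetical on-chip buffer capacity, in elements

ranking = rank_layers(layers, BUFFER)
print(ranking)
print(fused_cost(layers[0], layers[1], BUFFER))
```

In this toy setting the `fc` layer ranks first because its weights dominate off-chip traffic, so a pruning loop guided by such a ranking would target it before the convolutional layers. Fusing `conv1` and `conv2` costs 5600 accesses against 7200 when they are processed separately, which is the kind of saving layer fusion is meant to capture.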


Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Publication Date: Oct 2, 2020
Citations: 40

Similar Papers

GANDSE: Generative Adversarial Network-based Design Space Exploration for Neural Network Accelerator Design
Lang Feng ... Wenjian Liu
ACM Transactions on Design Automation of Electronic Systems | VOL. 28
19 Mar 2023

Accelerating FPGA design space exploration using circuit similarity-based placement
Xiaoyu Shi ... Guohui Lin
01 Dec 2010

Design and Evolution of Cyber Physical Systems: A Dynamic Data Driven Application System
16 Dec 2015

A Flexible Design Automation Tool for Accelerating Quantized Spectral CNNs
Rachit Rajat ... Hanqing Zeng
01 Sep 2019
