Abstract

Convolutional neural networks (CNNs) have shown great success in many areas such as object detection and pattern recognition, but at the cost of extremely high computational complexity and significant external memory access, which makes state-of-the-art deep CNNs difficult to implement on resource-constrained portable/wearable devices with limited battery capacity. To address this design challenge, this paper proposes a power-efficient CNN design based on zero-gating processing elements (PEs) and a partial-sum-reuse-centric dataflow. Unlike existing works, which either consider only the zeros in activation maps or rely on an off-chip training process to reduce on-chip computation, the proposed zero-gating PE avoids unnecessary on-chip computation by taking advantage of the large number of zeros in both the filter weights of pre-trained models and the activation maps. Furthermore, a partial-sum-reuse-centric dataflow is proposed to reduce off-chip DRAM accesses. Evaluation results show that the overall power consumption of the PE array with the proposed design is reduced by 37% and 14%, at the cost of 8% and 1% area overhead, compared with the baseline PE design and the existing activation-gated-only design (i.e., that in Eyeriss), respectively. Moreover, the proposed method achieves 35% and 47% DRAM access reduction, with corresponding 14% and 49% energy savings, for AlexNet and VGG-16 compared with Eyeriss.
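As a rough illustration of the partial-sum reuse idea, the following Python sketch (not the paper's actual dataflow; the layer shapes and loop order are assumptions for illustration) keeps each output's partial sum in a local on-chip accumulator across all input channels and filter taps, so every output value is written off-chip only once instead of being spilled and refetched as intermediate partial results.

# Illustrative sketch (not the paper's dataflow): each output element is fully
# accumulated in a local "psum" variable (standing in for an on-chip register/
# SRAM entry) before a single off-chip write.
import numpy as np

def conv_psum_reuse(ifmap, weights):
    """ifmap: (C, H, W), weights: (M, C, R, S) -> ofmap: (M, H-R+1, W-S+1)."""
    C, H, W = ifmap.shape
    M, _, R, S = weights.shape
    Ho, Wo = H - R + 1, W - S + 1
    ofmap = np.zeros((M, Ho, Wo))
    dram_writes = 0
    for m in range(M):
        for y in range(Ho):
            for x in range(Wo):
                psum = 0.0                      # held on-chip until complete
                for c in range(C):              # accumulate over all channels
                    for r in range(R):
                        for s in range(S):
                            psum += ifmap[c, y + r, x + s] * weights[m, c, r, s]
                ofmap[m, y, x] = psum           # single off-chip write per output
                dram_writes += 1
    return ofmap, dram_writes

In a real accelerator the same principle is applied per tile within the PE array and its local buffers; the point of the sketch is only that holding partial sums on chip until they are complete removes the DRAM traffic that would otherwise carry intermediate sums back and forth.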

Highlights

  • Convolutional Neural Networks (CNNs) have made great contributions to computer vision [2]; AlexNet [3], VGG [4] and ResNet [5] are among the most popular models widely used in image classification.

  • The memory system is evaluated: the memory accesses (both DRAM and SRAM) and the energy consumption of three CNNs are evaluated with the CACTI tool [22].

  • The sizes of the on-chip SRAM and off-chip DRAM in the proposed method are set to 256 KB and 256 MB, respectively.


Summary

Introduction

Convolutional Neural Networks (CNNs) have made great contributions to computer vision [2]; AlexNet [3], VGG [4] and ResNet [5] are among the most popular models widely used in image classification. The proposed method considers the zeros in activation maps as well, so it can avoid unnecessary on-chip computations and achieve a greater power reduction than the existing work [7].
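To make this benefit concrete, the following Python sketch (the sparsity levels, tensor shapes, and counting method are illustrative assumptions, not figures from the paper) counts how many multiply-accumulate operations a PE could gate off when it checks only activations versus when it checks both activations and weights. Gating on either operand being zero always skips at least as many operations as gating on activations alone.

# Hedged sketch: compare skippable MACs under activation-only gating versus
# gating on a zero in either the activation or the weight. The 40%/30%
# sparsity values below are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)

def sparse(shape, zero_frac):
    x = rng.standard_normal(shape)
    x[rng.random(shape) < zero_frac] = 0.0
    return x

acts = sparse((64, 27, 27), zero_frac=0.4)      # e.g. post-ReLU activations
wgts = sparse((192, 64, 3, 3), zero_frac=0.3)   # e.g. pruned/near-zero weights

# One output position: every (activation, weight) operand pair is one MAC.
a_patch = acts[:, :3, :3].reshape(1, -1)        # (1, 64*3*3)
w_flat = wgts.reshape(192, -1)                  # (192, 64*3*3)

total_macs = w_flat.size
skipped_act_only = int((a_patch == 0).sum()) * w_flat.shape[0]   # zero activation kills one MAC per filter
skipped_both = int(np.count_nonzero((a_patch == 0) | (w_flat == 0)))

print(f"activation-only gating skips {skipped_act_only / total_macs:.1%} of MACs")
print(f"weight-or-activation gating skips {skipped_both / total_macs:.1%} of MACs")

Under the assumed sparsity levels the second count is noticeably higher, which is the extra headroom a PE gated on both operands can exploit for power reduction.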
