Abstract

Convolutional neural networks (CNNs) have shown great success in many areas such as object detection and pattern recognition, but at the cost of extremely high computational complexity and significant external memory access, which makes state-of-the-art deep CNNs difficult to implement on resource-constrained portable/wearable devices with limited battery capacity. To address this design challenge, this paper proposes a power-efficient CNN design based on zero-gating processing elements (PEs) and a partial-sum-reuse-centric dataflow. Unlike existing works, which either consider only the zeros in activation maps or rely on an off-chip training process to reduce on-chip computation, the proposed zero-gating PE avoids unnecessary on-chip computation by taking advantage of the large number of zeros in both the filter weights of pre-trained models and the activation maps. Furthermore, a partial-sum-reuse-centric dataflow is proposed to reduce off-chip DRAM accesses. Evaluation results show that the overall power consumption of the PE array with the proposed design is reduced by 37% and 14%, at the cost of 8% and 1% area overhead, compared with the baseline PE design and the existing activation-gated-only design (i.e., that in Eyeriss), respectively. Moreover, the proposed method achieves 35% and 47% DRAM access reduction, with corresponding 14% and 49% energy savings, for AlexNet and VGG-16 compared with Eyeriss.
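As a rough illustration of the partial-sum reuse idea, the following Python sketch (not the paper's actual dataflow; the layer shapes and loop order are assumptions for illustration) keeps each output's partial sum in a local on-chip accumulator across all input channels and filter taps, so every output value is written off-chip only once instead of being spilled and refetched as intermediate partial results.

# Illustrative sketch (not the paper's dataflow): each output element is fully
# accumulated in a local "psum" variable (standing in for an on-chip register/
# SRAM entry) before a single off-chip write.
import numpy as np

def conv_psum_reuse(ifmap, weights):
    """ifmap: (C, H, W), weights: (M, C, R, S) -> ofmap: (M, H-R+1, W-S+1)."""
    C, H, W = ifmap.shape
    M, _, R, S = weights.shape
    Ho, Wo = H - R + 1, W - S + 1
    ofmap = np.zeros((M, Ho, Wo))
    dram_writes = 0
    for m in range(M):
        for y in range(Ho):
            for x in range(Wo):
                psum = 0.0                      # held on-chip until complete
                for c in range(C):              # accumulate over all channels
                    for r in range(R):
                        for s in range(S):
                            psum += ifmap[c, y + r, x + s] * weights[m, c, r, s]
                ofmap[m, y, x] = psum           # single off-chip write per output
                dram_writes += 1
    return ofmap, dram_writes

In a real accelerator the same principle is applied per tile within the PE array and its local buffers; the point of the sketch is only that holding partial sums on chip until they are complete removes the DRAM traffic that would otherwise carry intermediate sums back and forth.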

Highlights

  • Convolutional Neural Networks (CNNs) have made great contributions to computer vision [2]; AlexNet [3], VGG [4] and ResNet [5] are among the most popular models widely used in image classification.

  • The memory system is evaluated: the memory accesses (both DRAM and SRAM) and the energy consumption of three CNNs are evaluated with the CACTI tool [22].

  • The sizes of the on-chip SRAM and off-chip DRAM in the proposed method are set to 256 KB and 256 MB, respectively.


Summary

Introduction

Convolutional Neural Networks (CNNs) have made great contributions to computer vision [2]; AlexNet [3], VGG [4] and ResNet [5] are among the most popular models widely used in image classification. The proposed method considers the zeros in activation maps as well, so it can avoid unnecessary on-chip computations and achieve a greater power reduction than the existing work [7].
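To make this benefit concrete, the following Python sketch (the sparsity levels, tensor shapes, and counting method are illustrative assumptions, not figures from the paper) counts how many multiply-accumulate operations a PE could gate off when it checks only activations versus when it checks both activations and weights. Gating on either operand being zero always skips at least as many operations as gating on activations alone.

# Hedged sketch: compare skippable MACs under activation-only gating versus
# gating on a zero in either the activation or the weight. The 40%/30%
# sparsity values below are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)

def sparse(shape, zero_frac):
    x = rng.standard_normal(shape)
    x[rng.random(shape) < zero_frac] = 0.0
    return x

acts = sparse((64, 27, 27), zero_frac=0.4)      # e.g. post-ReLU activations
wgts = sparse((192, 64, 3, 3), zero_frac=0.3)   # e.g. pruned/near-zero weights

# One output position: every (activation, weight) operand pair is one MAC.
a_patch = acts[:, :3, :3].reshape(1, -1)        # (1, 64*3*3)
w_flat = wgts.reshape(192, -1)                  # (192, 64*3*3)

total_macs = w_flat.size
skipped_act_only = int((a_patch == 0).sum()) * w_flat.shape[0]   # zero activation kills one MAC per filter
skipped_both = int(np.count_nonzero((a_patch == 0) | (w_flat == 0)))

print(f"activation-only gating skips {skipped_act_only / total_macs:.1%} of MACs")
print(f"weight-or-activation gating skips {skipped_both / total_macs:.1%} of MACs")

Under the assumed sparsity levels the second count is noticeably higher, which is the extra headroom a PE gated on both operands can exploit for power reduction.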
