Abstract

Over the past decade, deep convolutional neural networks (CNNs) have been widely embraced in visual recognition applications owing to their extraordinary accuracy. However, their high computational complexity and large data storage requirements pose two challenges for CNN hardware design. In this paper, we propose an energy-aware bit-serial streaming deep CNN accelerator to tackle these challenges. By using a ring streaming dataflow and an output reuse strategy to decrease data access, the external DRAM access for the convolutional layers of AlexNet is reduced by 357.26x compared with the case of no output reuse. We optimize hardware utilization and avoid unnecessary computation by applying the loop tiling technique and by mapping the strides of the convolutional layers to unit strides, enhancing computational performance. In addition, the bit-serial processing element (PE) is designed to use fewer bits for the weights, which reduces both the amount of computation and the external memory access. We evaluate our design with the well-known roofline model, exploring the design space to find the solution with the best computational performance and computation-to-communication (CTC) ratio. Our design achieves a 1.36x speedup and reduces the energy consumption of external memory access by 41% compared with the design in [1]. The hardware implementation of our PE array architecture reaches an operating frequency of 119 MHz, occupies 68 k gates, and consumes 10.08 mW in TSMC 90-nm technology. Compared with the 15.4 MB of external memory access required by Eyeriss [2] for the convolutional layers of AlexNet, our method requires only 4.36 MB, dramatically reducing the costliest component of power consumption.
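The roofline-based design-space exploration described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual evaluation: the numeric roofs, CTC values, and tiling candidates are hypothetical assumptions.

```python
def attainable_performance(compute_roof, bandwidth_roof, ctc_ratio):
    """Attainable throughput (ops/s) under the roofline model.

    compute_roof:   peak computational performance of the accelerator (ops/s)
    bandwidth_roof: peak external-memory bandwidth (bytes/s)
    ctc_ratio:      computation-to-communication ratio
                    (ops per byte of external DRAM traffic)
    """
    # Below the ridge point the design is memory-bound (bandwidth * CTC);
    # above it, performance is capped by the compute roof.
    return min(compute_roof, bandwidth_roof * ctc_ratio)


# Illustrative exploration: choose the candidate with the best
# (attainable performance, CTC ratio) pair. Candidates are hypothetical
# tiling configurations, each with a different amount of output reuse.
candidates = [
    {"tiling": "small tiles", "ctc": 4.0},   # less reuse, more DRAM traffic
    {"tiling": "large tiles", "ctc": 12.0},  # better output reuse
]
best = max(
    candidates,
    key=lambda c: (attainable_performance(100e9, 10e9, c["ctc"]), c["ctc"]),
)
```

With the assumed 100 Gops/s compute roof and 10 GB/s bandwidth roof, the small-tile candidate is memory-bound while the large-tile candidate reaches the compute roof, so the exploration selects the configuration with higher reuse.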
