Abstract
Deep convolutional neural networks (CNNs) are widely used in modern AI systems for their superior accuracy but at the cost of high computational complexity. The complexity comes from the need to simultaneously process hundreds of filters and channels in the high-dimensional convolutions, which involve a significant amount of data movement. Although highly-parallel compute paradigms, such as SIMD/SIMT, effectively address the computation requirement to achieve high throughput, energy consumption still remains high as data movement can be more expensive than computation. Accordingly, finding a dataflow that supports parallel processing with minimal data movement cost is crucial to achieving energy-efficient CNN processing without compromising accuracy. In this paper, we present a novel dataflow, called row-stationary (RS), that minimizes data movement energy consumption on a spatial architecture. This is realized by exploiting local data reuse of filter weights and feature map pixels, i.e., activations, in the high-dimensional convolutions, and minimizing data movement of partial sum accumulations. Unlike dataflows used in existing designs, which only reduce certain types of data movement, the proposed RS dataflow can adapt to different CNN shape configurations and reduces all types of data movement through maximally utilizing the processing engine (PE) local storage, direct inter-PE communication and spatial parallelism. To evaluate the energy efficiency of the different dataflows, we propose an analysis framework that compares energy cost under the same hardware area and processing parallelism constraints. Experiments using the CNN configurations of AlexNet show that the proposed RS dataflow is more energy efficient than existing dataflows in both convolutional (1.4× to 2.5×) and fully-connected layers (at least 1.3× for batch size larger than 16). The RS dataflow has also been demonstrated on a fabricated chip, which verifies our energy analysis.
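To make the data-movement problem concrete, the following is a minimal sketch (not taken from the paper) of the untiled loop nest for a single convolutional layer. The dimension names are illustrative: N = batch size, M = output channels (filters), C = input channels, E/F = output feature map height/width, R/S = filter height/width, U = stride. A dataflow such as RS determines how these loops are tiled and mapped onto the PE array so that filter weights, input activations, and partial sums are reused from local storage rather than re-fetched from off-chip memory.

```c
/* Illustrative sketch of the high-dimensional convolution a CNN dataflow
 * reorders and tiles; array layouts and names are assumptions, not the
 * paper's implementation. Assumes H = (E-1)*U + R and W = (F-1)*U + S. */
void conv_layer(const float *ifmap,   /* [N][C][H][W]   input activations  */
                const float *filter,  /* [M][C][R][S]   filter weights     */
                float *ofmap,         /* [N][M][E][F]   output activations */
                int N, int M, int C, int H, int W,
                int R, int S, int E, int F, int U)
{
    for (int n = 0; n < N; n++)
      for (int m = 0; m < M; m++)
        for (int e = 0; e < E; e++)
          for (int f = 0; f < F; f++) {
            float psum = 0.0f;                      /* partial sum */
            for (int c = 0; c < C; c++)
              for (int r = 0; r < R; r++)
                for (int s = 0; s < S; s++)
                  psum += ifmap[((n*C + c)*H + e*U + r)*W + f*U + s]
                        * filter[((m*C + c)*R + r)*S + s];
            ofmap[((n*M + m)*E + e)*F + f] = psum;
          }
}
```

Every term in the accumulation touches three data types (weights, activations, partial sums); the RS dataflow is designed to reduce the movement cost of all three, rather than optimizing for only one of them.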