Abstract

This article presents a communication-aware processing-in-memory deep neural network accelerator, which implements an in-memory entry-counting scheme for low-bit-width quantized multiply-and-accumulate (MAC) operations. To maintain good accuracy on ImageNet, the proposed design adopts a full-stack co-design methodology spanning algorithms, circuits, and architectures. At the algorithm level, an entry-counting-based MAC is proposed that fits the learned step-size quantization scheme and intrinsically exploits the sparsity of both activations and weights. At the circuit level, content-addressable memory cells and multiplexed arrays are developed in the processing-in-memory macro. At the architecture level, the proposed design is compatible with different stationary dataflow mappings, further reducing memory accesses. An in-memory entry-counting silicon prototype, including all peripheral circuits, is fabricated in 65-nm LP CMOS technology with an active area of $0.76\times 0.66$ mm$^2$. The 7.36-Kb processing-in-memory macro with 128 search entries reduces the number of multiplications by $12.8\times$. The peak throughput is 3.58 GOPS, achieved at a clock rate of 143 MHz and a supply voltage of 1.23 V. The peak energy efficiency of the processing-in-memory macro is 11.6 TOPS/W, achieved at a clock rate of 40 MHz and a supply voltage of 1.01 V. Notably, the physical design of the entry-counting memory is completed in a standard digital placement-and-routing flow by augmenting the library with two dedicated memory cells. A 3-bit quantized ResNet-18 is evaluated on the ImageNet dataset, achieving a top-1 accuracy of 64.4%.
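The abstract does not detail the entry-counting MAC, but its stated properties (fewer multiplications for low-bit-width weights, intrinsic exploitation of activation and weight sparsity) suggest the following functional sketch: instead of performing one multiply per activation-weight pair, activations are first accumulated into per-weight-level bins (the "counting"/search step, which a content-addressable memory can do in parallel), and a single multiply is then performed per distinct quantized level. The function name and structure below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def entry_counting_mac(activations, weights):
    """Illustrative entry-counting dot product for quantized operands.

    Rather than len(weights) multiplies, activations sharing the same
    quantized weight level are summed first (accumulation only), and one
    multiply is issued per nonzero level. With b-bit weights there are at
    most 2**b levels, so e.g. a 128-entry vector with 3-bit weights needs
    at most 7 nonzero-level multiplies instead of 128.
    """
    acc = 0.0
    for w in np.unique(weights):
        if w == 0:
            continue  # zero weights contribute nothing (weight sparsity)
        mask = (weights == w) & (activations != 0)  # skip zero activations
        acc += w * activations[mask].sum()          # one multiply per level
    return acc
```

Under these assumptions, the multiplication count drops from the vector length to the number of distinct nonzero weight levels, which is consistent in spirit with the reported $12.8\times$ reduction for 128 search entries, though the silicon mechanism is a CAM-based search rather than this software loop.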
