Abstract

Compute-in-memory (CIM) is a promising computing paradigm to accelerate the inference of deep neural network (DNN) algorithms due to its high processing parallelism and energy efficiency. Prior CIM-based DNN accelerators mostly consider full-custom design, which assumes that all the weights are stored on-chip. For lightweight smart edge devices, this assumption may not hold. In this article, CIM-based DNN accelerators are designed and benchmarked under different chip area constraints. First, a scheduling strategy and dataflow for DNN inference are investigated when only part of the weights can be stored on-chip. Two weight reload schemes are evaluated: 1) reload partial weights and reuse the input/output feature maps and 2) load a batch of inputs and reuse the partial weights on-chip across the batch. Then, a system-level performance benchmark is performed for the inference of ResNet-18 on the ImageNet data set. The design tradeoffs with different area constraints, dataflow, and device technologies [static random access memory (SRAM) versus ferroelectric field-effect transistor (FeFET)] are discussed.
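To make the tradeoff between the two weight reload schemes concrete, the sketch below gives a rough off-chip traffic model for a single layer whose weights exceed the on-chip CIM capacity. This is a hypothetical illustration, not the paper's benchmark methodology: the function names, traffic model, and all numerical parameters (weight size, feature-map size, on-chip capacity, batch sizes) are assumptions chosen only to show how the two scheduling choices scale with batch size.

```python
import math

# Hypothetical sketch (not from the paper): rough off-chip traffic model for the
# two weight reload schemes described above, for one layer whose weights exceed
# the on-chip CIM capacity. All names and numbers are illustrative assumptions.

def scheme1_traffic(weight_bytes, fmap_bytes, batch):
    """Scheme 1: reload all weight tiles for every image, reusing that image's
    input/output feature maps on-chip while the weight tiles stream through."""
    weight_traffic = weight_bytes * batch   # weights re-fetched for every image
    fmap_traffic = fmap_bytes * batch       # each image's feature maps fetched once
    return weight_traffic + fmap_traffic

def scheme2_traffic(weight_bytes, fmap_bytes, on_chip_bytes, batch):
    """Scheme 2: keep one partial-weight tile on-chip and stream the whole
    input batch through it before loading the next tile."""
    num_tiles = math.ceil(weight_bytes / on_chip_bytes)
    weight_traffic = weight_bytes                   # each tile fetched only once
    fmap_traffic = fmap_bytes * batch * num_tiles   # feature maps revisited per tile
    return weight_traffic + fmap_traffic

if __name__ == "__main__":
    W = 11 * 2**20      # ~11 MB of 8-bit ResNet-18-like weights (illustrative)
    F = 1 * 2**20       # per-image feature-map traffic (illustrative)
    C = 2 * 2**20       # on-chip CIM weight capacity under an area constraint
    for b in (1, 8, 32):
        print(f"batch={b:2d}  scheme1={scheme1_traffic(W, F, b):>12,d} B"
              f"  scheme2={scheme2_traffic(W, F, C, b):>12,d} B")
```

Under these assumed numbers, scheme 1's weight traffic grows linearly with batch size, while scheme 2 pays the weight cost once per tile but re-reads feature maps for every tile, which is the kind of area- and batch-dependent tradeoff the article benchmarks at the system level.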
