Abstract

Artificial intelligence (AI) processors commonly run deep neural networks (DNNs), which consist mainly of convolution layers and fully connected (FC) layers. Both layer types require highly parallel multiply-and-accumulate (MAC) operations and generate a great deal of intermediate data. Under the von Neumann computing architecture, data transfer between the processor and memory incurs high energy consumption and long latency, which significantly degrades system performance and efficiency. Computation-in-memory (CIM) is a promising candidate for improving the energy efficiency of MAC operations in AI chips. However, CIM designs face an inherent tradeoff between high precision and high energy efficiency. This paper reviews the precision requirements of popular DNN models and outlines the tradeoff between precision and energy efficiency in SRAM-CIM designs.
