Abstract

SRAM-based computing-in-memory (SRAM-CIM) techniques have been widely studied for neural networks (NNs) to overcome the "Von Neumann bottleneck". However, as NN models continue to grow in scale, the weights cannot be fully stored on-chip owing to the large cell size (and thus limited capacity) of SRAM. In that case, the NN weight data must be frequently loaded from external memories, such as DRAM and Flash, which results in high energy consumption and low efficiency. In this paper, we propose a hybrid-device computing-in-memory (HD-CIM) architecture based on SRAM and MRAM (magnetic random-access memory). In our HD-CIM, the NN weight data are stored in on-chip MRAM and loaded into the SRAM-CIM core, significantly reducing energy and latency. In addition, to improve the data transfer efficiency between MRAM and SRAM, a high-speed pipelined MRAM readout structure is proposed to reduce the BL (bitline) charging time. Our results show that the NN weight loading energy in our design is only 0.242 pJ/bit, which is 289× less than loading from off-chip DRAM. Moreover, the energy breakdown and efficiency are analyzed for different NN models, such as VGG19, ResNet18, and MobileNetV1. Our design improves energy efficiency by 58× to 124×.
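To put the reported per-bit figures in perspective, below is a minimal back-of-envelope sketch (not from the paper) estimating the energy to load a full set of weights from on-chip MRAM versus off-chip DRAM. Only the 0.242 pJ/bit figure and the 289× ratio come from the abstract; the model parameter counts are approximate public values, and the 8-bit weight precision is an illustrative assumption.

    # Back-of-envelope weight-loading energy comparison.
    # Only E_MRAM and the 289x ratio are from the abstract; parameter
    # counts are approximate, and 8-bit precision is an assumption.

    E_MRAM_PJ_PER_BIT = 0.242            # on-chip MRAM -> SRAM-CIM (reported)
    E_DRAM_PJ_PER_BIT = 0.242 * 289      # off-chip DRAM, 289x higher (reported ratio)
    BITS_PER_WEIGHT = 8                  # assumed weight precision

    models = {
        "VGG19":       143.7e6,          # approximate parameter counts
        "ResNet18":     11.7e6,
        "MobileNetV1":   4.2e6,
    }

    for name, params in models.items():
        bits = params * BITS_PER_WEIGHT
        e_mram_mj = bits * E_MRAM_PJ_PER_BIT * 1e-9   # pJ -> mJ
        e_dram_mj = bits * E_DRAM_PJ_PER_BIT * 1e-9
        print(f"{name:12s} MRAM: {e_mram_mj:8.3f} mJ   DRAM: {e_dram_mj:9.3f} mJ")

Under these assumptions, a single full weight load of VGG19 costs roughly 0.28 mJ from on-chip MRAM versus roughly 80 mJ from off-chip DRAM, which illustrates why keeping weights on-chip dominates the energy savings the paper reports.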
