Abstract

On-chip training of large-scale deep neural networks (DNNs) is challenging. To address the memory wall problem, compute-in-memory (CIM) is a promising approach that exploits analog computation inside the memory array to speed up vector-matrix multiplication (VMM). On-chip CIM training poses additional challenges, including the need for higher weight precision and higher analog-to-digital converter (ADC) resolution. In this work, we propose a mixed-precision RRAM-based CIM architecture that overcomes these challenges and supports on-chip training. In particular, we split the multi-bit weight into the most significant bits (MSBs) and the least significant bits (LSBs). The forward and backward propagations are performed with CIM transposable arrays for the MSBs only, while the weight update is performed in regular memory arrays that store the LSBs. The impact of ADC resolution on training accuracy is analyzed. We explore the training performance of a convolutional VGG-like network on the CIFAR-10 dataset using this Mixed-precision IN-memory Training (MINT) architecture, showing that it achieves ~91% accuracy under hardware constraints with ~4.46 TOPS/W energy efficiency. Compared with baseline RRAM-based CIM architectures, it achieves 1.35× higher energy efficiency while occupying only 31.9% of the chip area (~98.86 mm² at the 32 nm node).
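As a rough illustration of the weight-splitting idea described above, the NumPy sketch below separates an integer weight into an MSB part (held in the transposable CIM array) and an LSB part (held in a regular memory array): the forward VMM reads only the MSBs, while the gradient update lands in the full-precision weight and is re-split afterwards. The bit widths `MSB_BITS` and `LSB_BITS`, the unsigned encoding, and the carry behavior are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

# Assumed bit widths; the paper's exact MSB/LSB split is not specified here.
MSB_BITS, LSB_BITS = 3, 5
TOTAL_BITS = MSB_BITS + LSB_BITS

rng = np.random.default_rng(0)
# Unsigned integer weights in [0, 2^TOTAL_BITS); signed handling omitted.
w = rng.integers(0, 2**TOTAL_BITS, size=(4, 4))

msb = w >> LSB_BITS              # MSB part: stored in the CIM transposable array
lsb = w & ((1 << LSB_BITS) - 1)  # LSB part: stored in the regular memory array

x = rng.integers(0, 2, size=4)   # binary input vector

# Forward pass: the CIM array computes the VMM with the MSB part only.
y = msb @ x

# Weight update: an integer gradient step is applied to the recombined weight.
# Small updates are absorbed by the LSBs; the MSB array changes only when
# the LSBs overflow or underflow (a carry/borrow into the MSB part).
grad = rng.integers(-2, 3, size=(4, 4))
w_new = np.clip((msb << LSB_BITS) + lsb - grad, 0, 2**TOTAL_BITS - 1)
msb, lsb = w_new >> LSB_BITS, w_new & ((1 << LSB_BITS) - 1)
```

Keeping frequent, small updates confined to the LSB array is what lets the CIM arrays operate at low precision during propagation, consistent with the mixed-precision scheme the abstract outlines.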
