Resistive random-access memory (RRAM) based non-volatile computing-in-memory (nvCIM) has been regarded as a promising solution for efficient data-intensive artificial intelligence (AI) applications on resource-limited edge systems. However, existing weighted-current-summation-based nvCIM suffers from device non-idealities and incurs significant time, storage, and energy overheads when processing high-precision analog signals. To address these issues, we propose a 3T2R digital nvCIM macro for a fully hardware-implemented binary convolutional neural network (HBCNN), targeting edge AI acceleration at low weight precision. By quantizing the voltage-division results of the RRAMs through inverters, the 3T2R macro provides stable rail-to-rail outputs without analog-to-digital converters or sense amplifiers. Moreover, both batch normalization and sign activation are integrated on-chip. Hybrid simulation results show that the proposed 3T2R digital macro achieves 86.2% (95.6%) accuracy on the CIFAR-10 (MNIST) dataset, a 4.7% (1.9%) accuracy loss relative to the software baselines, together with a peak energy efficiency of 51.3 TOPS/W and a minimum latency of 8 ns, realizing an energy-efficient, low-latency, and robust AI processor.
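The binary compute flow the abstract describes (binary multiply-accumulate, on-chip batch normalization, sign activation) can be sketched in software. The function below is a hypothetical illustration, not the authors' circuit: it models one HBCNN neuron with ±1 inputs and weights, and shows why folding batch normalization into the sign activation reduces the on-chip work to a single integer threshold comparison. All parameter names (`gamma`, `beta`, `mu`, `sigma`, `tau`) are illustrative assumptions.

```python
def sign(x):
    # Sign activation: maps a pre-activation to a binary output (+1 / -1).
    return 1 if x >= 0 else -1

def binary_neuron(inputs, weights, gamma, beta, mu, sigma):
    """One binarized neuron: +/-1 inputs and weights, BN folded into a threshold.

    inputs, weights        : sequences of +1 / -1 values
    gamma, beta, mu, sigma : batch-norm scale, shift, running mean, running std
    (Hypothetical model, not the 3T2R netlist.)
    """
    acc = sum(a * w for a, w in zip(inputs, weights))  # binary MAC
    # Batch normalization computes y = gamma * (acc - mu) / sigma + beta.
    # Followed by sign(), it reduces to comparing acc against a threshold
    # tau = mu - beta * sigma / gamma (flipping the comparison if gamma < 0),
    # so only an integer compare is needed in hardware.
    tau = mu - beta * sigma / gamma
    return sign(acc - tau) if gamma > 0 else sign(tau - acc)

# Example: acc = 0, tau = -1, so the neuron fires +1.
x = [1, -1, 1, 1]
w = [1, 1, -1, 1]
print(binary_neuron(x, w, gamma=1.0, beta=0.5, mu=0.0, sigma=2.0))
```

This threshold-folding trick is standard in binarized neural networks and is consistent with the abstract's claim that batch normalization and sign activation are integrated on-chip without high-precision arithmetic.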