Abstract

Analog compute-in-memory with resistive random access memory (RRAM) devices promises to overcome the data movement bottleneck in data-intensive artificial intelligence (AI) and machine learning. RRAM crossbar arrays improve the efficiency of vector-matrix multiplication (VMM), a vital operation in these applications. The prototype IC is the first complete, fully integrated analog-RRAM CMOS coprocessor. This article focuses on the digital and analog circuitry that supports efficient and flexible RRAM-based computation. A passive $54\times108$ RRAM crossbar array performs VMM in the analog domain. Specialized mixed-signal circuits stimulate the RRAM crossbar and read its outputs. The single-chip CMOS prototype includes a reduced instruction set computer (RISC) processor interfaced to a memory-mapped mixed-signal core. In the mixed-signal core, ADCs and DACs interface with the passive RRAM crossbar. The RISC processor controls the mixed-signal circuits and the algorithm data path. The system is fully programmable and supports forward and backward propagation. As proof of concept, a fully integrated 0.18-$\mu\text{m}$ CMOS prototype with a postprocessed RRAM array demonstrates several key functions of machine learning, including online learning. The mixed-signal core consumes 64 mW at an operating frequency of 148 MHz. The total system power consumption, including the mixed-signal circuitry, the digital processor, and the passive RRAM array, is 307 mW. The maximum theoretical throughput is 2.6 GOPS at an efficiency of 8.5 GOPS/W.
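
As a quick consistency check on the figures above, and assuming the quoted efficiency is simply the maximum throughput divided by the total system power (the abstract does not state how it was computed):

$$\frac{2.6\ \text{GOPS}}{0.307\ \text{W}} \approx 8.5\ \text{GOPS/W}$$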

Highlights

  • The energy consumption of data movement tasks behind artificial intelligence (AI) and machine learning presents a significant bottleneck in system performance

  • Input voltages are applied to the crossbar rows to perform analog vector-matrix multiplications (VMMs), and the resulting column currents are measured

  • Measurement setup: the prototype RRAM coprocessor is wirebonded to a 391-pin PGA package for testing and measurement

Summary

INTRODUCTION

The energy consumption of data movement tasks behind artificial intelligence (AI) and machine learning presents a significant bottleneck in system performance. Mixed-signal solutions using passive and active resistive random access memory (RRAM) crossbar arrays have been demonstrated (see [3]–[5]) but require discrete or external peripheral devices for operation. Input voltages are applied to the crossbar rows to perform analog VMM, and the resulting column currents are measured. Each column current flowing into a virtual ground is the dot product of the row voltages and that column's RRAM conductances. The RRAM bitcells are programmed using dedicated write DACs. For maximum throughput, the array should operate in parallel: each row and each column of the crossbar has dedicated hardware, namely an ADC, a read DAC, and a write DAC. To enable a wide range of algorithms, our architecture also supports transverse operation, where the input is applied to the columns and the current is read from the rows, which are terminated to virtual grounds.
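
The row-driven and transverse read modes described above can be sketched numerically. This is a minimal, idealized model and not the authors' implementation: it assumes perfectly linear conductances and ideal virtual grounds, and it ignores wire resistance and DAC/ADC quantization. The function names `forward_vmm` and `transverse_vmm`, the 2x3 array size, and all values are hypothetical and chosen only for illustration.

```python
# Idealized crossbar model (illustrative assumption, not the chip's behavior):
# G[i][j] is the conductance in siemens of the RRAM cell joining row i and
# column j. Lines held at virtual ground collect the summed cell currents.

def forward_vmm(G, v_rows):
    """Row voltages in, column currents out: I_j = sum_i v_i * G[i][j]."""
    n_rows, n_cols = len(G), len(G[0])
    assert len(v_rows) == n_rows
    return [sum(v_rows[i] * G[i][j] for i in range(n_rows))
            for j in range(n_cols)]

def transverse_vmm(G, v_cols):
    """Column voltages in, row currents out: I_i = sum_j G[i][j] * v_j."""
    n_cols = len(G[0])
    assert len(v_cols) == n_cols
    return [sum(row[j] * v_cols[j] for j in range(n_cols)) for row in G]

# Toy 2x3 array with hypothetical conductances (siemens) and voltages (volts).
G = [[1e-6, 2e-6, 3e-6],
     [4e-6, 5e-6, 6e-6]]
i_cols = forward_vmm(G, [0.5, 0.25])          # three column currents (amperes)
i_rows = transverse_vmm(G, [0.5, 0.0, 0.25])  # two row currents (amperes)
```

In this model the transverse read multiplies by the transpose of the same conductance matrix without reprogramming any cell, which is the kind of operation a backward-propagation step needs.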

RRAM COPROCESSOR
PULSE-DOMAIN WRITE AND READ DACs
ACTIVE VIRTUAL GROUND AND INTEGRATING
ACTIVE VIRTUAL-GROUND WITH
CONCLUSION
