Abstract

Embedded nonvolatile memory (NVM) and computing-in-memory (CIM) are significantly reducing the latency (t_MAC) and energy consumption (E_MAC) of multiply-and-accumulate (MAC) operations in artificial intelligence (AI) edge devices [1, 2]. Previous ReRAM CIM macros demonstrated MAC operations for 1b-input, ternary-weighted, 3b-output CNNs [1] or 1b-input, 8b-weighted, 1b-output fully-connected networks with limited accuracy [2]. To support higher-accuracy applications that rely heavily on convolutional neural networks (CNNs), NVM-CIM should support multibit inputs/weights and multibit outputs (MAC-OUT) for CNN operations. One way to achieve multibit weights is to use a multi-level ReRAM cell to store the weight. However, as shown in Fig. 24.1.1, multibit ReRAM CIM faces several challenges: (1) a tradeoff between area and speed for multibit input/weight/MAC-OUT MAC operations; (2) the sense amplifier's high input offset, large area, and high parasitic load on the read-path due to large BL currents (I_BL) from multibit MAC; and (3) limited accuracy due to a small read/sensing margin (I_SM) across MAC-OUT or variation in cell resistance (particularly in multi-level cells, MLCs). To overcome these challenges, this work proposes (1) a serial-input non-weighted product (SINWP) structure to optimize the tradeoff between area, t_MAC and E_MAC; (2) a down-scaling weighted current translator (DSWCT) and positive-negative current-subtractor (PN-ISUB) for short delay, a small offset and a compact read-path area; and (3) a triple-margin small-offset current-mode sense amplifier (TMCSA) to tolerate a small I_SM. A fabricated 55nm 1Mb ReRAM-CIM macro is the first ReRAM CIM macro to support CNN operations using multibit input/weight MAC-OUT. This device achieves the shortest CIM-MAC-access time (t_AC) among existing ReRAM-CIMs (t_MAC = 14.6ns with 2b-input, 3b-weight with 4b-MAC-OUT) and the best peak E_MAC of 53.17 TOPS/W (in binary mode).
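
As a purely arithmetic illustration of serial-input MAC evaluation, the following Python sketch feeds one input bit-plane per cycle, forms a partial sum in which each input bit contributes either zero or its weight, and applies the bit's place value only after the partial sum is formed. This is a generic bit-serial scheme and is not taken from the SINWP circuit itself; the function name `bit_serial_mac`, the LSB-first ordering, the unsigned 2b inputs, and the signed weight values are all assumptions chosen for illustration.

```python
def bit_serial_mac(inputs, weights, input_bits=2):
    """Compute sum(x * w) by processing one input bit-plane per cycle.

    Hypothetical helper for illustration only: inputs are assumed to be
    unsigned integers of `input_bits` bits, weights are signed integers.
    """
    total = 0
    for k in range(input_bits):  # one cycle per input bit, LSB first (assumed order)
        # Partial sum for this bit-plane: each input bit gates its weight (0 or w).
        partial = sum(((x >> k) & 1) * w for x, w in zip(inputs, weights))
        # The place value 2**k is applied after the partial sum is formed.
        total += partial << k
    return total


inputs = [3, 1, 2, 0]      # 2b unsigned inputs (assumed example values)
weights = [3, -2, 1, -4]   # signed weights in a 3b range (assumed example values)

# The bit-serial result matches the direct dot product.
assert bit_serial_mac(inputs, weights) == sum(x * w for x, w in zip(inputs, weights))
```

Under these assumptions, the number of cycles grows with the input bit width while each cycle needs only 1b input drivers, which is one way to picture the area/speed tradeoff named in challenge (1).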
