Abstract

Due to its ultrahigh density and commercially mature fabrication technology, 3-D NAND flash memory has been proposed as an attractive candidate for an inference engine targeting deep neural network (DNN) workloads. However, the peripheral circuits of conventional 3-D NAND flash must be modified to enable compute-in-memory (CIM), and the chip architecture needs to be redesigned for an optimized dataflow. In this work, we present a design of a 3-D NAND-CIM accelerator based on macro parameters from an industry-grade prototype chip. The DNN inference performance is evaluated using the DNN+NeuroSim framework. To exploit the ultrahigh density of 3-D NAND flash, both input and weight mapping strategies are introduced to improve the throughput. Benchmarking on the VGG network was performed across the technology candidates for CIM, including SRAM, resistive random access memory (RRAM), and 3-D NAND. Compared with similar designs using SRAM or RRAM, the results show that the 3-D NAND-based CIM design achieves not only 17%–24% of the chip size but also 1.9–2.7 times higher energy efficiency for 8-bit precision inference. The inference accuracy drop induced by 3-D NAND string current drift and variation is also investigated. No accuracy degradation due to current variation was observed with the proposed input mapping scheme, whereas the accuracy is sensitive to current drift, implying that compensation schemes are needed to maintain inference accuracy.
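To make the drift and variation study concrete, the following is a minimal sketch, not taken from the paper or the DNN+NeuroSim framework, of how string current drift (a systematic shift) and cell-to-cell variation (random noise) could be injected into conductance-encoded weights to probe their effect on inference outputs. All function names and parameter values (e.g., the drift and sigma settings) are illustrative assumptions.

```python
# Illustrative sketch: inject 3-D NAND string current drift and variation
# into quantized, conductance-encoded weights and measure the resulting
# output error of a toy fully connected layer.
import numpy as np

rng = np.random.default_rng(0)

def quantize_weights(w, bits=8):
    """Uniformly quantize weights to signed integer levels (illustrative)."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / levels
    return np.round(w / scale).astype(np.int32), scale

def perturb_currents(w_int, scale, drift=0.0, sigma=0.0, rng=rng):
    """Map integer weight levels to nominal string currents, then apply a
    systematic fractional drift and a random relative variation (sigma)."""
    i_cell = w_int.astype(np.float64)                              # nominal current per level
    i_cell = i_cell * (1.0 + drift)                                # systematic current drift
    i_cell = i_cell * (1.0 + sigma * rng.standard_normal(i_cell.shape))  # cell-to-cell variation
    return i_cell * scale                                          # back to weight domain

# Toy layer: relative output error serves as a proxy for accuracy impact.
w = rng.standard_normal((256, 128)) * 0.05
x = rng.standard_normal((32, 256))
w_int, scale = quantize_weights(w, bits=8)

y_ref = x @ (w_int * scale)
for drift, sigma in [(0.0, 0.02), (0.05, 0.0), (0.05, 0.02)]:
    y = x @ perturb_currents(w_int, scale, drift=drift, sigma=sigma)
    err = np.linalg.norm(y - y_ref) / np.linalg.norm(y_ref)
    print(f"drift={drift:.2f}, sigma={sigma:.2f} -> relative output error {err:.3f}")
```

In such a toy setup, zero-mean random variation tends to average out over long NAND strings and large dot products, while a systematic drift biases every product in the same direction, which is consistent with the abstract's observation that accuracy is robust to variation but sensitive to drift.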
