Abstract

Nonvolatile computing-in-memory (nvCIM) exhibits high potential for neuromorphic computing involving massive parallel computation and for achieving high energy efficiency. nvCIM is especially suitable for deep neural networks, which must perform large numbers of matrix–vector multiplications. However, a comprehensive quantization algorithm that overcomes the hardware limitations of resistive random access memory (ReRAM)-based nvCIM, such as the limited numbers of I/Os, word lines (WLs), and ADC output bits, has yet to be developed. In this article, we propose a quantization training method for compressing deep models. The method comprises three steps: input and weight quantization, ReRAM convolution (ReConv), and ADC quantization. ADC quantization addresses the error-sampling problem by using the Gumbel-softmax trick. Under a 4-bit ADC in nvCIM, accuracy decreases by only 0.05% and 1.31% on MNIST and CIFAR-10, respectively, compared with the accuracies obtained under an ideal ADC. The experimental results indicate that the proposed method effectively compensates for the hardware limitations of nvCIM macros.
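
As a concrete illustration of the ADC-quantization step, here is a minimal PyTorch sketch that samples a discrete ADC output with the Gumbel-softmax trick so the sampling error remains differentiable during training. The distance-based logits, uniform level spacing, and temperature are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: differentiable ADC quantization via Gumbel-softmax.
import torch
import torch.nn.functional as F

def gumbel_softmax_adc(i_bl, levels, tau=1.0):
    # Logits: negative squared distance from the bitline current to each
    # representable ADC level, so nearby levels are the most probable.
    logits = -(i_bl.unsqueeze(-1) - levels) ** 2
    # hard=True returns a one-hot sample with a straight-through gradient,
    # so the stochastic sensing error stays trainable end to end.
    y = F.gumbel_softmax(logits, tau=tau, hard=True)
    return (y * levels).sum(dim=-1)

levels = torch.linspace(0.0, 1.0, 16)        # 16 levels for a 4-bit ADC
i_bl = torch.rand(8, requires_grad=True)     # toy bitline currents
out = gumbel_softmax_adc(i_bl, levels, tau=0.5)
out.sum().backward()                         # gradients reach i_bl
```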

Highlights

  • Deep neural networks (DNNs) have highly flexible parametric properties, and these properties are being exploited to develop artificial intelligence (AI) applications in domains ranging from cloud computing to edge computing

  • Under a 4-bit ADC in nonvolatile computing-in-memory (nvCIM), accuracy decreases by only 0.05% and 1.31% on MNIST and CIFAR-10, respectively, compared with the accuracies obtained under an ideal ADC

  • Based on our analysis of nvCIM, we propose a quantization scheme that accounts for its hardware limitations

Summary

INTRODUCTION

Deep neural networks (DNNs) have highly flexible parametric properties, and these properties are being exploited to develop artificial intelligence (AI) applications in domains ranging from cloud computing to edge computing. The techniques for achieving these improvements involve the design of the entire nvCIM macro, the physical characteristics of ReRAM, and the precision of the ADC outputs. Given the input patterns and process variation discussed earlier, increasing the input precision yields closely overlapping bitline-current (IBL) distributions when the MAC value is high [see Fig. 3(b)]. One approach is to use multiple single-level-cell (SLC) ReRAM cells to represent one weight value. However, this increases the area cost and the complexity of the MAC operation, which in turn affects the power and latency of the entire input process. As IBL increases under multibit MAC operations, the input offset of the ADC grows, which reduces the accuracy of the sensing output [see Fig. 4(a)]. Owing to the limitations of ReRAM cells and the need to preserve model accuracy, the proposed weight quantization focuses on the design of SLC ReRAM.
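
To illustrate the SLC mapping described above, the following is a minimal NumPy sketch in which each bit of a multi-bit weight is stored in a separate single-level cell and the MAC result is recombined by shift-and-add. The 4-bit weight width and LSB-first ordering are illustrative assumptions.

```python
# Hedged sketch: one multi-bit weight split across several SLC ReRAM
# bit-planes, with the MAC recombined from per-bit bitline sums.
import numpy as np

def slc_mac(inputs, weights, n_bits=4):
    """Compute sum(inputs * weights) with weights split into SLC bit-planes."""
    acc = 0
    for b in range(n_bits):
        bit_plane = (weights >> b) & 1      # one SLC ReRAM array per bit
        i_bl = np.dot(inputs, bit_plane)    # bitline current for this plane
        acc += i_bl * (1 << b)              # shift-and-add recombination
    return acc

inputs = np.array([1, 0, 1, 1])             # binary WL inputs
weights = np.array([5, 3, 7, 2])            # 4-bit weights
assert slc_mac(inputs, weights) == np.dot(inputs, weights)
```

The sketch makes the cost trade-off above visible: each extra weight bit requires one more SLC array and one more sensing pass, which is exactly the area and latency overhead the analysis attributes to multibit MAC operations.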

ReRAM CONVOLUTION
[Algorithm listing: ReConv. Step 5: optionally apply pooling.]
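
Since only step 5 of the ReConv listing is preserved, the following is a minimal Python sketch of a ReConv-style dot product under assumed hardware limits: a fixed number of simultaneously active word lines and a low-bit ADC on each partial sum. The function names, the word-line group size, and the uniform ADC model are illustrative assumptions rather than the paper's exact algorithm.

```python
# Hedged sketch: convolution dot product split across word-line groups,
# with each partial sum quantized by an assumed 4-bit ADC.
import numpy as np

def adc_quantize(partial_sum, adc_bits, max_val):
    """Clamp and round a partial sum to 2^adc_bits uniformly spaced levels."""
    levels = 2 ** adc_bits - 1
    code = np.clip(np.round(partial_sum / max_val * levels), 0, levels)
    return code / levels * max_val

def reconv_dot(x_bits, w_bits, max_wl=9, adc_bits=4):
    """Dot product of binary inputs and binary (SLC) weights, computed in
    word-line groups of size max_wl, each sensed through the ADC."""
    total = 0.0
    for start in range(0, len(x_bits), max_wl):
        seg = np.dot(x_bits[start:start + max_wl],
                     w_bits[start:start + max_wl])
        total += adc_quantize(seg, adc_bits, max_val=max_wl)
    return total

x = np.random.randint(0, 2, 27)   # e.g., a flattened 3x3x3 input patch
w = np.random.randint(0, 2, 27)
print(reconv_dot(x, w))
```

Pooling (step 5 in the listing) would then be applied to the resulting feature map after all ReConv dot products complete.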
CONCLUSION