Abstract

Resistive random access memory (RRAM) is a promising technology for energy-efficient neuromorphic accelerators. However, when a pretrained deep neural network (DNN) model is programmed to an RRAM array for inference, the model suffers from accuracy degradation due to RRAM nonidealities, such as device variations, quantization error, and stuck-at faults. Previous solutions involving multiple read–verify–write (R-V-W) operations on the RRAM cells require cell-by-cell compensation and, thus, an excessive amount of processing time. In this article, we propose a joint algorithm-design solution to mitigate the accuracy degradation. We first leverage knowledge distillation (KD), where the model is trained with the RRAM nonidealities to increase the robustness of the model under device variations. Furthermore, we propose random sparse adaptation (RSA), which integrates a small on-chip memory with the main RRAM array for postmapping adaptation. Only the on-chip memory is updated to recover the inference accuracy. The joint algorithm-design solution achieves the state-of-the-art accuracy of 99.41% for MNIST (LeNet-5) and 91.86% for CIFAR-10 (VGG-16) with up to 5% of the parameters as overhead while providing a 15–150× speedup compared with R-V-W.
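The RSA idea above can be sketched numerically: a random, sparse subset of the mapped weights is mirrored in a small on-chip memory, and only those mirrored values are updated during adaptation, while the noisy RRAM array is never reprogrammed. The sketch below is illustrative only; the layer shape, the 5% density, and the lognormal device-variation model are assumptions for demonstration, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pretrained layer weights (in practice, taken from the DNN).
W = rng.standard_normal((128, 64)).astype(np.float32)

# RRAM nonideality: device variation modeled here as multiplicative
# lognormal noise on each programmed cell (an assumed noise model).
sigma = 0.1
W_rram = W * rng.lognormal(mean=0.0, sigma=sigma, size=W.shape).astype(np.float32)

# Random sparse adaptation: pick ~5% of the cells to mirror on-chip.
density = 0.05
mask = rng.random(W.shape) < density

# The on-chip copies start from the mapped values; during adaptation,
# gradient updates would touch only W_sram[mask], never the RRAM array.
W_sram = W_rram.copy()

# Effective weights seen at inference: on-chip values shadow the RRAM
# cells they mirror; all other cells read straight from RRAM.
W_eff = np.where(mask, W_sram, W_rram)
```

The key property is that the adaptation footprint is bounded by the mask density, so the on-chip memory overhead stays at roughly 5% of the parameters while the bulk of the model remains in dense RRAM.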

Highlights

  • Today, deep neural networks (DNNs) have achieved or even surpassed human-level performance in many fields, such as image recognition [1], natural language processing [2], and robotics

  • The knowledge distillation (KD)+random sparse adaptation (RSA) accuracy corresponds to the scenario where both KD-based variation-aware training (VAT) and RSA are performed

  • KD+RSA, with up to 5% of the parameters on the on-chip memory, outperforms all the previous approaches and achieves the state-of-the-art inference accuracy of 91.86% and 99.13% for CIFAR-10 and MNIST, respectively
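The KD component of the highlights above combines soft targets from a teacher model with the ground-truth labels while the student is trained under injected device variations. A minimal distillation-loss sketch follows; the temperature T, the mixing weight alpha, and the numpy formulation are illustrative assumptions, not the paper's exact hyperparameters.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=np.float64) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Standard KD objective: alpha * soft-target term + (1 - alpha) * hard-label term."""
    # Soft-target term: cross-entropy between teacher and student
    # distributions at temperature T, rescaled by T^2 so its gradient
    # magnitude stays comparable to the hard-label term.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = -(p_t * np.log(p_s + 1e-12)).sum(axis=-1).mean() * (T * T)
    # Hard-label term: ordinary cross-entropy with the ground truth.
    p1 = softmax(student_logits)
    hard = -np.log(p1[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * soft + (1 - alpha) * hard
```

In variation-aware training, the student logits would be computed from weights perturbed with the RRAM noise model, so the distilled model learns to stay accurate under the variations it will see after mapping.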


Summary

Introduction

Today, deep neural networks (DNNs) have achieved or even surpassed human-level performance in many fields, such as image recognition [1], natural language processing [2], and robotics. However, GPUs in general consume high energy and computational resources in both training and inference operations [3], [5]. The main driving force of semiconductor design, CMOS technology scaling, is approaching its limit and finds it increasingly difficult to deliver the computation power needed for DNNs. In addition, the conventional CMOS architecture faces the memory wall, i.e., the von Neumann bottleneck [7], which further complicates the challenge of achieving high-performance, energy-efficient computing. In this context, there is an urgent need for hardware acceleration that explores beyond-traditional CMOS technology as well as architectural and algorithmic solutions [6].

