A backpropagation with gradient accumulation algorithm capable of tolerating memristor non-idealities for training memristive neural networks

Shuai Dong, Yihong Chen, Zhen Fan, Kaihui Chen, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao, Jun-Ming Liu

doi:10.1016/j.neucom.2022.04.008

Abstract

Memristive neural networks (MNNs) have emerged as a new computing architecture with high speed and low power consumption, but their hardware implementation is hampered mainly by the device non-idealities of memristors. Here, we propose a backpropagation with gradient accumulation (BP-GA) algorithm that can effectively tolerate these non-idealities. We first show that a memristor-based single-layer perceptron trained with BP-GA achieves high image recognition accuracies (>85% on the MNIST dataset) over a wide range of learning rates (0.01–50), with large nonlinearities (>6), a small number of conductance states (down to 3 states), a broad spectrum of ON/OFF ratios (across 3 orders of magnitude), and large noise (up to 40%). The origin of this robustness against the learning rate and memristor non-idealities is then investigated and shown to lie in the operation mechanism of BP-GA. In this algorithm, the weights can keep increasing in magnitude due to the gradient accumulation and therefore become movable even at a small learning rate (or, equivalently, a large ON/OFF ratio). Moreover, the weights corresponding to the foreground and background of the input image are moved to the positive and negative boundaries of the weight range, respectively, which allows the image features to be learned using only a small number of conductance states. These boundary weights remain trapped at the boundaries without significant oscillation under the effect of the accumulated gradients, resulting in immunity to large nonlinearity and noise. Therefore, BP-GA makes the MNN insensitive to the learning rate and robust against the memristor non-idealities. The performance of BP-GA is further evaluated on a memristor-based multilayer perceptron (on the MNIST dataset) and a convolutional neural network (on the CIFAR-10 dataset). For both MNNs, BP-GA exhibits relatively high accuracies and good tolerance against memristor non-idealities, demonstrating its applicability to complex networks and problems. This study provides a viable algorithm-level approach for addressing some important hardware implementation issues of MNNs using realistic memristors.
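
The effect of gradient accumulation on a weight with few conductance states can be illustrated with a toy example. The sketch below is not the BP-GA algorithm from the paper, whose exact update rule is not given in the abstract; it is one plausible reading, under stated assumptions. A single weight is restricted to evenly spaced levels in [-1, 1], gradients are summed in a buffer, and the buffer is cleared whenever the weight changes state. The function names (`quantize`, `train_plain`, `train_accumulated`) and the reset rule are hypothetical.

```python
def quantize(w, n_states, w_min=-1.0, w_max=1.0):
    """Snap w to the nearest of n_states evenly spaced levels in [w_min, w_max]."""
    step = (w_max - w_min) / (n_states - 1)
    w = min(max(w, w_min), w_max)          # clip to the allowed weight range
    idx = round((w - w_min) / step)        # index of the nearest level
    return w_min + idx * step


def train_plain(grads, lr, n_states, w0=0.0):
    """Plain per-step update: each small step is lost to quantization."""
    w = quantize(w0, n_states)
    for g in grads:
        w = quantize(w - lr * g, n_states)
    return w


def train_accumulated(grads, lr, n_states, w0=0.0):
    """Assumed accumulation rule (illustrative, not the paper's exact method):
    sum gradients in a buffer and move the weight only once the accumulated
    update is large enough to reach a different quantized level."""
    w = quantize(w0, n_states)
    acc = 0.0
    for g in grads:
        acc += g
        target = quantize(w - lr * acc, n_states)
        if target != w:
            w = target
            acc = 0.0                      # assumption: clear buffer after a move
    return w
```

With 200 identical gradients of -1.0, a learning rate of 0.01, and 3 states, `train_plain` leaves the weight at 0.0 because every individual step is smaller than half a level spacing, whereas `train_accumulated` moves it to the upper boundary 1.0, where clipping then holds it in place. This mirrors, in a simplified way, the abstract's description of weights becoming movable at small learning rates and settling at the boundaries of the weight range.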

Full Text