This paper presents a ReRAM-based convolutional neural network (CNN) accelerator with a new analog layer normalization (ALN) technique. The proposed ALN effectively reduces the effect of conductance variation in ReRAM devices by normalizing the outputs of the vector-matrix multiplication (VMM) in the charge domain. The ALN achieves high energy and hardware efficiency because it normalizes the VMM outputs directly, without storing their values in memory, and is merged into the neuron circuit of the accelerator. To verify the effect of the ALN experimentally, a VMM accelerator consisting of two 25 × 25 ReRAM arrays and peripheral circuits with ALN is used for a convolution layer, together with digital signal processing (DSP) in a field-programmable gate array (FPGA). The MNIST dataset is used to train and run inference on a CNN employing two VMM accelerators that operate as convolution layers in a pipelined manner. Despite the conductance variation of the ReRAM devices, the ALN successfully stabilizes the output distribution of the convolution layer, which improves the classification accuracy of the network. Final classification accuracies of 96.2% and 83.1% are achieved for the MNIST and Fashion-MNIST datasets, respectively, with an energy efficiency of 9.94 tera-operations per second per watt (TOPS/W).
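The core idea, normalizing each VMM output vector so that its distribution stays stable despite conductance errors, can be illustrated with a small behavioral model. The sketch below is not the paper's charge-domain ALN circuit: the array size matches the 25 × 25 arrays described above, but the variation model, noise magnitudes, and NumPy-level implementation are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ideal weights mapped to a 25 x 25 crossbar (values are arbitrary for this sketch).
W = rng.standard_normal((25, 25))

# Assumed variation model: a global conductance drift plus device-to-device spread;
# the 0.8 gain and 15 % spread are illustrative numbers, not measured device data.
W_var = 0.8 * W * (1.0 + 0.15 * rng.standard_normal(W.shape))

X = rng.standard_normal((256, 25))        # flattened convolution patches
Y_ideal, Y_var = X @ W.T, X @ W_var.T     # crossbar VMM outputs

def layer_norm(y, eps=1e-6):
    # Zero-mean, unit-variance normalization of each output vector,
    # i.e. a digital stand-in for what ALN performs in the charge domain.
    return (y - y.mean(-1, keepdims=True)) / np.sqrt(y.var(-1, keepdims=True) + eps)

# Without normalization, the output statistics drift with the conductance error;
# after normalization, the ideal and varied output distributions align.
print("output std, ideal vs. varied (raw):",
      round(Y_ideal.std(), 3), round(Y_var.std(), 3))
print("output std, ideal vs. varied (ALN):",
      round(layer_norm(Y_ideal).std(), 3), round(layer_norm(Y_var).std(), 3))
```

In this toy model the normalization only restores the first- and second-order statistics of the layer output; the paper's contribution is performing this step in analog, on the VMM charges themselves, so that no intermediate digital storage is needed.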