Abstract

Mixed-signal artificial neural networks (ANNs) that employ analog matrix-multiplication accelerators can achieve higher speed and improved power efficiency than their fully digital counterparts. Although analog computing is known to be susceptible to noise and device imperfections, various analog computing paradigms have been considered promising solutions to the growing computing demand of machine learning applications, thanks to the robustness of ANNs. This robustness has been explored in low-precision, fixed-point ANN models, which have proven successful in compressing ANN model size on digital computers. However, these promising results and network training algorithms cannot be easily migrated to analog accelerators, because digital computers typically carry intermediate results at a higher bit width even when the inputs and weights of each ANN layer are of low bit width, whereas analog intermediate results have low precision, analogous to digital signals with a reduced quantization level. Here we report a training method for mixed-signal ANNs that accounts for two types of errors in their analog signals: random noise and deterministic errors (distortions). The results show that mixed-signal ANNs trained with our proposed method can achieve equivalent classification accuracy with a noise level of up to 50% of the ideal quantization step size. We have demonstrated this training method on a mixed-signal optical convolutional neural network based on diffractive optics.
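The training approach described above lends itself to a simple noise-injection sketch. The following is a minimal illustration, assuming a PyTorch-style layer; the class name, the Gaussian noise model, and the tanh distortion term are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class NoisyAnalogLinear(nn.Module):
    """Hypothetical mixed-signal layer: the analog accelerator's output is
    modeled as the ideal matrix product plus random noise and a deterministic
    distortion, both expressed relative to the quantization step `delta`."""

    def __init__(self, in_features, out_features, n_levels=4, noise_ratio=0.5):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.n_levels = n_levels          # e.g. 4 levels ~ 2-bit intermediate results
        self.noise_ratio = noise_ratio    # noise std as a fraction of the step size

    def forward(self, x):
        y = self.linear(x)
        # Quantization step implied by the layer's current dynamic range (assumption).
        delta = (2 * y.detach().abs().max()) / (self.n_levels - 1)
        if self.training:
            # Random noise: zero-mean Gaussian, std = noise_ratio * delta.
            y = y + torch.randn_like(y) * self.noise_ratio * delta
            # Deterministic distortion: a mild, fixed nonlinearity (illustrative only).
            y = y + 0.05 * delta * torch.tanh(y / (delta + 1e-12))
        return y
```

During inference the injected terms are disabled, so the layer reduces to an ordinary linear layer standing in for the analog accelerator.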

Highlights

  • Artificial neural networks (ANNs) are growing larger and deeper [1]–[3] to tackle tasks of increasing complexity [4]–[6]

  • A model trained by such a method is likely to have inferior inference performance [22], as analog intermediate results cannot match the full precision of those obtained with a digital computer

  • The mixed-signal convolutional neural network (MCNN) trained with our method maintained its accuracy at 2 bits (4 levels), indicating that the mixed-signal neural network can operate at a reduced quantization level of the intermediate results


Summary

INTRODUCTION

Artificial neural networks (ANNs) are growing larger and deeper [1]–[3] to tackle tasks of increasing complexity [4]–[6]. Ex-situ training has been deployed on a simulated analog unit using a fixed-point data format [21], which is analogous to a low-bit-width neural network running on a digital computer. A model trained by such a method is likely to have inferior inference performance [22], as analog intermediate results cannot match the full precision of those obtained with a digital computer. Low-precision neural networks perform matrix multiplications and convolutions between fixed-point inputs and weights, which is the required data format for many digital tensor processing units [24]. In this fixed-point format, L is the total bit width, m is the shared exponent among all tensor elements, and [·] denotes rounding to the nearest integer. Several nonlinearities, such as clipping and scaling functions, have been purposefully designed for easier integration with the quantization operation [25], [26].
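For concreteness, the shared-exponent fixed-point quantization implied by these definitions can be sketched as follows; the function name, the choice of a two's-complement clipping range, and the example values are assumptions for illustration, not taken from the cited works.

```python
import numpy as np

def fixed_point_quantize(x, L=8, m=-4):
    """Quantize a tensor to an L-bit fixed-point format with shared exponent m.

    Each element is scaled by 2**-m, rounded to the nearest integer ([.]),
    clipped to a signed L-bit range, and scaled back.  The exact clipping
    convention of the cited works may differ; this sketch uses the common
    two's-complement range [-2**(L-1), 2**(L-1) - 1].
    """
    q = np.rint(x / 2.0 ** m)                        # [x / 2^m]
    q = np.clip(q, -2 ** (L - 1), 2 ** (L - 1) - 1)  # keep within L bits
    return q * 2.0 ** m                              # back to the real-valued scale

# Example: 2-bit quantization (4 levels) mimics the reduced precision of
# analog intermediate results discussed in the abstract.
acts = np.array([0.07, -0.33, 0.49, 1.2])
print(fixed_point_quantize(acts, L=2, m=-1))  # -> [ 0.  -0.5  0.5  0.5]
```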

MIXED-SIGNAL ANN LAYER WITH AN ANALOG ACCELERATOR
Findings
MCNN EXPERIMENT