Abstract

Standard Convolution (StdConv) is the main technique used in state-of-the-art Deep Convolutional Neural Networks (DCNNs). Depthwise Separable Convolution (SepConv) is an alternative that requires fewer computations. In many applications where low latency is essential, such as smart cameras and autonomous vehicles, a crucial issue is deploying a lightweight, low-cost inference model while keeping acceptable accuracy and a tolerable computation and memory-access load. A flexible architecture for different DCNN convolution types and models is proposed. The flexibility comes from sharing a single memory access unit across different layer types, regardless of the selected kernel size, by multiplying each weight vector by local operators with variable aperture. Moreover, a single depthwise computation unit can be used for both standard and pointwise layers. The learnable parameters are quantized to an 8-bit fixed-point representation, which incurs only a very limited loss of accuracy while considerably reducing Field-Programmable Gate Array (FPGA) resource usage. To reduce processing time, inter-layer computations are performed in parallel. Experiments are conducted on the greyscale ORL database with a shallow Convolutional Neural Network (CNN) and on the colored Canadian Institute for Advanced Research 10-class (CIFAR-10) database with a DCNN; comparable accuracies of 93% and 85.7% are achieved, respectively, on a very low-cost Spartan-3E and a moderate-cost Zynq FPGA platform.
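
To make the computation saving concrete, the usual cost comparison for depthwise separable convolution can be written as follows. The symbols (kernel size $D_K$, input channels $M$, output channels $N$, output feature-map size $D_F$) are illustrative and are not necessarily the paper's notation:

\[
\text{Cost}_{\text{StdConv}} = D_K^2 \, M \, N \, D_F^2, \qquad
\text{Cost}_{\text{SepConv}} = D_K^2 \, M \, D_F^2 + M \, N \, D_F^2,
\]
\[
\frac{\text{Cost}_{\text{SepConv}}}{\text{Cost}_{\text{StdConv}}} = \frac{1}{N} + \frac{1}{D_K^2}.
\]

For a 3x3 kernel ($D_K = 3$) and a large number of output channels $N$, the ratio approaches $1/9$, i.e., roughly an 8x to 9x reduction in multiply-accumulate operations, which is the saving the abstract refers to.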
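As a minimal sketch of the 8-bit fixed-point quantization step, the Python snippet below quantizes a weight tensor to signed 8-bit codes with a power-of-two per-tensor scale, a common choice in FPGA fixed-point designs. The function names and the scaling scheme are assumptions for illustration, not the paper's actual implementation:

import numpy as np

def quantize_fixed_point(weights, total_bits=8):
    # Illustrative scheme, not the paper's: signed fixed-point with a
    # power-of-two scale, so that weights ~= codes * 2**(-frac_bits).
    max_abs = np.max(np.abs(weights))
    # Integer bits needed to cover the dynamic range (plus the sign bit).
    int_bits = max(0, int(np.ceil(np.log2(max_abs + 1e-12))) + 1)
    frac_bits = total_bits - 1 - int_bits  # remaining bits hold the fraction
    scale = 2.0 ** frac_bits
    codes = np.clip(np.round(weights * scale),
                    -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1)
    return codes.astype(np.int8), frac_bits

def dequantize(codes, frac_bits):
    # Recover approximate float weights from the fixed-point codes.
    return codes.astype(np.float32) * 2.0 ** (-frac_bits)

# Example: quantize a small 3x3x16 weight tensor and measure the error.
w = np.random.uniform(-0.9, 0.9, size=(3, 3, 16)).astype(np.float32)
q, fb = quantize_fixed_point(w)
err = np.max(np.abs(w - dequantize(q, fb)))
print(f"fractional bits: {fb}, max abs error: {err:.4f}")

With such a scheme the maximum rounding error is bounded by half a quantization step, which is consistent with the very limited accuracy loss reported in the abstract.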
