Abstract

Edge devices are becoming smarter with the integration of machine learning methods, such as deep learning, and are therefore used in many application domains where decisions have to be made without human intervention. Deep learning and, in particular, convolutional neural networks (CNNs) are more effective than previous algorithms for several computer vision applications, such as security and surveillance, where image and video analysis are required. This better accuracy comes at the cost of high computation and memory requirements. Hence, running CNNs on embedded computing devices is a challenge for both algorithm and hardware designers. New processing devices, dedicated system architectures and network optimizations have been researched to deal with these computation requirements. In this paper, we improve the inference execution times of CNNs in low-density FPGAs (Field-Programmable Gate Arrays) using fixed-point arithmetic, zero-skipping and weight pruning. The developed architecture supports the execution of large CNNs in FPGA devices with reduced on-chip memory and computing resources. With the proposed architecture, it is possible to infer an image with AlexNet in 2.9 ms on a ZYNQ7020 and 1.0 ms on a ZYNQ7045 with less than 1% accuracy degradation. These results improve on previous state-of-the-art architectures for CNN inference.

Highlights

  • Artificial intelligence is widely used in computer vision applications, improving tasks such as image classification [1], object detection, and image segmentation

  • This paper proposes a highly efficient architecture that combines zero-skipping, dynamic pruning, block pruning, fixed-point representations and image batching, to be implemented in low-density FPGAs for smart embedded systems

  • The architectural optimizations proposed in this paper are applied to a baseline architecture that implements large convolutional neural networks (CNNs) in low-density FPGAs using only an 8-bit fixed-point representation format, following the ideas of [7]
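The 8-bit fixed-point format used by the baseline architecture can be illustrated with a minimal quantization sketch. The paper does not specify its exact scaling and rounding scheme here, so the choice of a signed Q-format with a configurable number of fractional bits and saturation to the int8 range is an assumption for illustration:

```python
def to_fixed8(x: float, frac_bits: int) -> int:
    """Quantize a float to signed 8-bit fixed-point with `frac_bits`
    fractional bits, saturating to the int8 range [-128, 127].
    Illustrative only; the paper's exact scheme may differ."""
    q = round(x * (1 << frac_bits))   # scale and round to nearest integer
    return max(-128, min(127, q))     # saturate to 8 bits

def from_fixed8(q: int, frac_bits: int) -> float:
    """Recover the real value represented by the fixed-point code."""
    return q / (1 << frac_bits)
```

With 6 fractional bits, for example, 0.75 is represented exactly as the integer code 48, while values above the representable range saturate at 127.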


Summary

Introduction

Artificial intelligence is widely used in computer vision applications, improving tasks such as image classification [1], object detection, and image segmentation. Several other CNNs have been proposed in recent years, some regular and some irregular, with layers different from the usual convolutional and fully connected layers. Running any of these networks in an embedded system with strict performance, memory and energy constraints is a challenge because of the high number of weights and operations. This paper improves the baseline architecture with the following techniques: zero-skipping in the convolutional layers, where multiplications with zero-valued activations are skipped; dynamic zeroing of activations in convolutional layers; and coarse pruning of fully connected layers, where blocks of redundant weights are cut, reducing both the memory required to store them and the number of operations.
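The two main ideas above, zero-skipping in convolutional layers and coarse (block) pruning of fully connected weights, can be sketched in software. This is a minimal functional model, not the hardware implementation; the block size and the mean-magnitude pruning criterion are assumptions for illustration:

```python
import numpy as np

def zero_skip_dot(activations, weights):
    """Dot product that skips multiplications with zero-valued
    activations, modeling the zero-skipping scheme: a hardware PE
    would simply not issue the multiply for a zero activation."""
    acc = 0
    for a, w in zip(activations, weights):
        if a != 0:              # zero-skipping: zero activation, no multiply
            acc += a * w
    return acc

def block_prune(W, block, threshold):
    """Coarse pruning of a fully connected weight matrix: zero out
    whole block x block tiles whose mean absolute weight falls below
    `threshold`. Pruned tiles need not be stored or computed.
    The tile-mean criterion is illustrative; the paper's exact
    redundancy test may differ."""
    W = W.copy()
    rows, cols = W.shape
    for i in range(0, rows, block):
        for j in range(0, cols, block):
            tile = W[i:i + block, j:j + block]
            if np.abs(tile).mean() < threshold:
                W[i:i + block, j:j + block] = 0
    return W
```

Pruning whole blocks rather than individual weights keeps the memory layout regular, which is what makes the scheme attractive for an FPGA datapath with fixed-width memory ports.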

Related Work
Convolutional Neural Networks
Baseline Architecture for CNN Inference
PE Clusters
Feature Map Memory
Result
Zero-Skipping and Dynamic Pruning of Activations
Pruning of Weights in Fully Connected Layers
Designing with the Proposed Architecture for Best Performance
Performance Model
Area Model
Model Based Design
Results
Conclusions