Abstract

The deployment of large-scale deep neural networks on field programmable gate array (FPGA) platforms is severely hindered by the high demands on computational resources and off-chip data bandwidth. Traditional nonstructured sparsity algorithms can efficiently reduce the number of nonzero weights in neural network models. However, nonstructured sparse connections across channels also reduce the degree of computational parallelism and consequently degrade the performance of the FPGA accelerator. We propose an FPGA accelerator that exploits kernel-structured sparsity and hybrid arithmetic computation for the convolutional neural network (CNN). On the one hand, we introduce a hardware-friendly kernel pruning method to reduce the number of arithmetic operations in the CNN model. Our proposed method maintains high accuracy (less than 0.32% accuracy loss) and achieves a high degree of parallelism. On the other hand, we design a specific hybrid arithmetic computation scheme for the FPGA accelerator to speed up inference of the pruned CNN model. The FPGA accelerator consists of only 64 sets of hybrid 8-bit and 16-bit floating-point units for the convolution operation. Experiments on VGGNet16 demonstrate that the proposed FPGA accelerator achieves a state-of-the-art 5× reduction in convolution operations and 3× parameter compression. The proposed FPGA accelerator runs at 13.2 FPS, and the corresponding energy efficiency can be boosted up to 1.9 images/J.
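To make "kernel structured sparsity" concrete: unlike nonstructured pruning, which zeroes individual weights, kernel-level pruning removes entire k × k kernels, so the surviving computation stays regular and parallelizable on hardware. The snippet below is a minimal illustrative sketch, not the paper's actual algorithm; the `prune_kernels` function, the `keep_ratio` parameter, and the L1-norm selection criterion are all assumptions chosen for clarity.

```python
import numpy as np

def prune_kernels(weights, keep_ratio=0.2):
    """Zero out whole k x k kernels with the smallest L1 norms.

    weights: conv-layer tensor of shape (out_ch, in_ch, k, k).
    keep_ratio: fraction of kernels retained (hypothetical parameter;
    the paper's actual selection criterion may differ).
    """
    # One importance score per (out_ch, in_ch) kernel: its L1 norm.
    scores = np.abs(weights).sum(axis=(2, 3))
    n_keep = max(1, int(keep_ratio * scores.size))
    # Threshold at the n_keep-th largest score.
    thresh = np.sort(scores, axis=None)[-n_keep]
    mask = (scores >= thresh).astype(weights.dtype)
    # Broadcasting the mask zeroes entire kernels, so pruned positions
    # stay aligned across channels and parallelism is preserved.
    return weights * mask[:, :, None, None]

# Example: a VGG-style 3x3 conv layer with 64 input / 128 output channels,
# keeping roughly the 20% of kernels with the largest L1 norms.
w = np.random.randn(128, 64, 3, 3).astype(np.float32)
w_pruned = prune_kernels(w, keep_ratio=0.2)
print("kernels kept:", int((np.abs(w_pruned).sum(axis=(2, 3)) > 0).sum()))
```

Because whole kernels are removed rather than scattered weights, the accelerator's processing elements can skip pruned kernels as complete units, which is what allows the reported 5× operation reduction without sacrificing the parallelism that nonstructured sparsity would break.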

