WRA-MF: A Bit-Level Convolutional-Weight-Decomposition Approach to Improve Parallel Computing Efficiency for Winograd-Based CNN Acceleration

Siwei Xiang, Xianxian Lv, Cimang Lu, Jianfei Wang, Chen Yang, Yishuo Meng

doi:10.3390/electronics12244943

Open Access

https://doi.org/10.3390/electronics12244943

Journal: Electronics
Publication Date: Dec 8, 2023
License type: CC BY 4.0

Affiliation: Xi'an Jiaotong University

Abstract

FPGA-based convolutional neural network (CNN) accelerators have been studied extensively in recent years. To exploit the parallelism of multiplier–accumulator computation in convolution, most FPGA-based CNN accelerators depend heavily on the number of on-chip DSP blocks in the FPGA. Consequently, accelerator performance is restricted by the limited number of DSPs, which leads to an imbalance in the utilization of the other FPGA resources. This work proposes a multiplication-free convolutional acceleration scheme, named WRA-MF, to relieve the pressure on DSP resources. First, WRA-MF employs the Winograd algorithm to reduce the computational density. It then performs bit-level convolutional weight decomposition to eliminate the multiplication operations. Furthermore, extracting common factors reduces the complexity of the remaining addition operations. Experimental results on the Xilinx XCVU9P platform show that WRA-MF achieves a throughput of 7559 GOP/s at a clock frequency of 509 MHz for VGG16. Compared with state-of-the-art works, WRA-MF improves area efficiency by 3.47×–27.55×. The results indicate that the proposed architecture achieves high area efficiency while reducing the imbalance in resource utilization.
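The abstract does not spell out the WRA-MF datapath, so the following is only a minimal sketch of the two general ideas it names: a 1-D Winograd F(2,3) convolution, and replacing each multiplication by a weight with shifts and adds over the set bits of that weight. The function names (`shift_add_mul`, `winograd_f23_mf`, `direct_conv`) are hypothetical, the weights are assumed to be integers, and the common-factor extraction step mentioned in the abstract is not modeled.

```python
def shift_add_mul(x: int, w: int) -> int:
    """Return x * w using only shifts and adds over the set bits of |w|.

    Illustrative stand-in for bit-level weight decomposition; not the
    paper's actual hardware scheme.
    """
    acc = 0
    mag = abs(w)
    bit = 0
    while mag:
        if mag & 1:
            acc += x << bit  # add x * 2**bit for every set bit of the weight
        mag >>= 1
        bit += 1
    return -acc if w < 0 else acc


def winograd_f23_mf(d, g):
    """Winograd F(2,3): two outputs of a 3-tap filter from four inputs.

    The usual 1/2 factors of the filter transform are deferred to the
    output stage so that every intermediate value stays an integer.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (unscaled): sums of the weights only.
    u1 = g0 + g1 + g2
    u2 = g0 - g1 + g2
    # Four element-wise products, each done with shifts and adds.
    m1 = shift_add_mul(d0 - d2, g0)
    m2 = shift_add_mul(d1 + d2, u1)
    m3 = shift_add_mul(d2 - d1, u2)
    m4 = shift_add_mul(d1 - d3, g2)
    # m2 + m3 and m2 - m3 are always even, so the right shift is exact.
    y0 = m1 + ((m2 + m3) >> 1)
    y1 = ((m2 - m3) >> 1) - m4
    return y0, y1


def direct_conv(d, g):
    """Reference result: direct sliding-window computation with multiplies."""
    return tuple(sum(d[i + k] * g[k] for k in range(3)) for i in range(2))


print(winograd_f23_mf([1, 2, 3, 4], [1, -2, 3]))  # (6, 8)
```

In this sketch, F(2,3) needs four products where the direct form needs six, and each of those four products is itself reduced to shifts and additions, which is the kind of operation that maps to FPGA logic rather than DSP blocks.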

Similar Papers

An Uninterrupted Processing Technique-Based High-Throughput and Energy-Efficient Hardware Accelerator for Convolutional Neural Networks
Md Najrul Islam ... Rahul Shrestha
IEEE Transactions on Very Large Scale Integration (VLSI) Systems | VOL. 30
01 Dec 2022

Improving the Performance of CNN Accelerator Architecture under the Impact of Process Variations
Jingweijia Tan ... Weiren Wang
ACM Transactions on Design Automation of Electronic Systems | VOL. 28
09 Sep 2023

Process Variation Mitigation on Convolutional Neural Network Accelerator Architecture
Maodi Ma ... Xiaohui Wei
01 Nov 2019

AdaPrune: An Accelerator-Aware Pruning Technique for Sustainable CNN Accelerators
Jiajun Li ... Ahmed Louri
IEEE Transactions on Sustainable Computing | VOL. 7
01 Jan 2021
