Abstract

In recent years, convolutional neural networks (CNNs) have achieved significant advances in various fields. However, the computation and storage overheads of CNNs are overwhelming for Internet-of-Things devices. Both network pruning algorithms and hardware accelerators have been introduced to enable CNN inference at the edge. Network pruning algorithms reduce the size and computational cost of CNNs by regularizing unimportant weights to zero. However, existing works lack intrakernel structured sparsity types that trade off sparsity against hardware efficiency, and the index storage for irregularly pruned networks is significant. Hardware accelerators exploit the sparsity of pruned CNNs to improve energy efficiency, but their processing element (PE) utilization is low because of uneven sparsity among input convolutional kernels. To overcome these problems, we propose PACA: a Pattern pruning Algorithm and Channel-fused, high-PE-utilization Accelerator for CNNs. It comprises three parts: a pattern pruning algorithm that exploits an intrakernel sparsity type and reduces index storage, a channel-fused hardware architecture that reduces the PEs' idle rate and improves performance, and a heuristic, tabu-search-based smart fusion scheduler that analyzes the idle-PE problem and schedules the channel fusion in hardware. To demonstrate the effectiveness of PACA, we implemented the software parts in Python and the hardware architecture in RTL. Experimental results on various datasets show that, compared with an existing work, PACA reduces the index storage overhead by $3.47\times$–$5.63\times$ with 3.85–9.12 average patterns, and improves hardware performance by $2.01\times$–$5.53\times$ by reducing the PEs' idle rate.
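The abstract does not give implementation details, but the core pattern-pruning idea it describes is that each convolutional kernel is restricted to one of a few shared intrakernel shapes, so only a short pattern ID needs to be stored per kernel instead of per-weight indices. The following minimal NumPy sketch illustrates that general technique under our own assumptions (binary 3x3 masks chosen by frequency of top-magnitude layouts); all function names and parameters here are hypothetical and this is not the paper's exact algorithm.

```python
# Illustrative pattern pruning: every 3x3 kernel keeps weights only at the
# positions of one binary mask ("pattern") drawn from a small shared set,
# reducing per-kernel index storage to a single pattern ID.
import numpy as np

def build_pattern_set(weights, n_patterns=8, keep=4):
    """Pick the n_patterns most frequent top-`keep`-magnitude layouts.

    weights: array of shape (out_ch, in_ch, 3, 3)
    """
    kernels = weights.reshape(-1, 9)
    layouts = {}
    for k in kernels:
        top = np.argsort(np.abs(k))[-keep:]      # positions of largest weights
        key = tuple(sorted(top.tolist()))
        layouts[key] = layouts.get(key, 0) + 1
    ranked = sorted(layouts, key=layouts.get, reverse=True)[:n_patterns]
    masks = np.zeros((len(ranked), 9), dtype=weights.dtype)
    for i, key in enumerate(ranked):
        masks[i, list(key)] = 1.0
    return masks.reshape(-1, 3, 3)

def apply_patterns(weights, patterns):
    """Assign each kernel the pattern preserving the most weight magnitude."""
    out = np.empty_like(weights)
    ids = np.empty(weights.shape[:2], dtype=np.int64)
    flat_p = patterns.reshape(len(patterns), 9)
    for o in range(weights.shape[0]):
        for i in range(weights.shape[1]):
            k = weights[o, i].reshape(9)
            scores = flat_p @ np.abs(k)           # kept magnitude per pattern
            best = int(np.argmax(scores))
            ids[o, i] = best                      # one small index per kernel
            out[o, i] = (k * flat_p[best]).reshape(3, 3)
    return out, ids

# Example: prune a random layer to 8 shared patterns of 4 weights each.
w = np.random.randn(64, 32, 3, 3).astype(np.float32)
pats = build_pattern_set(w, n_patterns=8, keep=4)
pruned, pattern_ids = apply_patterns(w, pats)
```

Because every kernel then has the same number of nonzeros and a compact pattern ID, the index overhead shrinks and sparsity becomes even across kernels, which is what lets the channel-fused architecture keep PEs busy.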
