A tile-fusion method for accelerating Winograd convolutions

Zeyu Ji, Xingjun Zhang, Zheng Wei, Jingbo Li, Jia Wei

doi:10.1016/j.neucom.2021.06.003

Abstract

Compared with fast convolution methods such as im2col and the fast Fourier transform, Winograd-based convolution, which has been widely applied to accelerate convolutional neural networks (CNNs), can provide high performance with smaller filters. Although there are several reported studies on the algorithmic optimization of CNNs, most of them target hardware architectures. Existing implementations of the Winograd method perform well below what one would expect, because the tile size of Winograd-based convolution is usually chosen empirically and the features of each convolution layer are ignored. This study aims to fill this gap and focuses on the efficient implementation of Winograd-based convolution in CNN models. Specifically, we discuss the causes of poor performance, calculate the coefficients of the computational complexity model, and demonstrate a speedup in the inference process using an elaborate tile-fusion method, which derives the optimal tile size for each convolution layer in a CNN model. Compared with the representative existing implementations of cuDNN with a 4 × 4 tile, Arm Compute Library with a 6 × 6 tile, and NNPACK with an 8 × 8 tile, the results show significant performance improvements of up to 1.89×, 1.29×, and 1.17×, respectively.
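To make the role of the tile size concrete, the following minimal sketch shows the standard one-dimensional Winograd F(2,3) algorithm and the idealized multiplication savings of its two-dimensional form. It is not the tile-fusion method of the paper. The function names are hypothetical, and the mapping of the 4 × 4, 6 × 6, and 8 × 8 tiles to output sizes m = 2, 4, 6 assumes a 3 × 3 filter.

```python
def direct_conv_1d(d, g):
    """Direct sliding-window correlation: 2 outputs from 4 inputs, 6 multiplies."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Standard Winograd F(2,3): the same 2 outputs with 4 elementwise multiplies."""
    # Filter transform (can be precomputed once per filter).
    u = [g[0], (g[0] + g[1] + g[2]) / 2, (g[0] - g[1] + g[2]) / 2, g[2]]
    # Input transform (additions and subtractions only).
    v = [d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]]
    # Elementwise products: the only multiplications involving the input data.
    m = [ui * vi for ui, vi in zip(u, v)]
    # Output transform.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


def multiply_reduction_2d(m, r=3):
    """Idealized ratio of direct to Winograd multiplies for F(m x m, r x r).

    Direct convolution needs m*m*r*r multiplies per output tile;
    Winograd needs (m + r - 1)**2. Transform costs are ignored here.
    """
    tile = m + r - 1
    return (m * m * r * r) / (tile * tile)


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [0.5, 1.0, -2.0]
    print(direct_conv_1d(d, g), winograd_f23(d, g))
    for m in (2, 4, 6):  # input tiles of 4x4, 6x6, 8x8, assuming a 3x3 filter
        print(m + 2, multiply_reduction_2d(m))
```

Under these assumptions the multiply count drops by factors of 2.25, 4, and 5.0625 for the 4 × 4, 6 × 6, and 8 × 8 tiles. This count ignores the cost of the input, filter, and output transforms, which grows with the tile size, so a larger tile is not automatically faster for every layer.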

Journal: Neurocomputing
Publication Date: Jun 5, 2021
Citations: 2
