Abstract

In recent years, convolutional neural networks (CNNs) have trended toward lower computational cost in pursuit of lightweight models. In lightweight CNNs, depthwise separable convolution (DSC) is becoming the mainstream method. In DSC, however, the pointwise convolution (PWC) with $1\times 1$ filters still accounts for a large number of parameters and computations. In this paper, a more efficient convolution algorithm, named kernel shared group convolution (KSGC), is proposed to replace PWC. KSGC combines channel information and can be viewed as a single convolution kernel sliding along the channel dimension. In addition, the Winograd algorithm is used to reduce the number of multiplications required by KSGC. A CNN accelerator built around a novel processing element (PE) that performs 1-D Winograd convolution for KSGC was implemented on an Ultra96-V2 field-programmable gate array (FPGA). At a 200 MHz clock frequency, the accelerator achieved a computational performance of 52.7 GOPS and a performance-to-power ratio of 10.42 GOPS/W.
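
To illustrate how a 1-D Winograd transform can reduce multiplications when one kernel slides along the channel dimension, the following minimal sketch uses the standard F(2,3) formulation. The abstract does not state the kernel length, tile size, or stride used by KSGC, so the 3-tap kernel, the stride of 1, and the function names `direct_conv1d`, `winograd_f23`, and `winograd_conv1d` are assumptions made only for illustration; this is not the authors' implementation.

```python
def direct_conv1d(x, g):
    """Slide a 3-tap kernel g over the sequence x (valid positions, stride 1).

    Here x stands for the channel values at one pixel (an assumption).
    Costs 3 multiplications per output value.
    """
    return [x[i] * g[0] + x[i + 1] * g[1] + x[i + 2] * g[2]
            for i in range(len(x) - 2)]


def winograd_f23(d, g):
    """Standard Winograd F(2,3): 2 outputs from a 4-value tile d and 3-tap g.

    Uses 4 element-wise multiplications instead of the 6 needed directly.
    """
    # Filter transform; with a shared kernel this can be computed once and reused.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Input transform followed by the 4 multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


def winograd_conv1d(x, g):
    """Apply F(2,3) to overlapping 4-value tiles of x, advancing 2 outputs per tile."""
    n_out = len(x) - 2
    if n_out % 2 != 0:
        raise ValueError("this sketch needs an even number of outputs")
    y = []
    for i in range(0, n_out, 2):
        y.extend(winograd_f23(x[i:i + 4], g))
    return y
```

Both functions compute the same sliding-kernel result; the Winograd variant spends 2 multiplications per output instead of 3, at the cost of extra additions. Because the kernel is shared, its transformed form would only need to be computed once.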
