Abstract

In recent years, convolutional neural networks (CNNs) have been widely adopted for computer vision tasks. FPGAs have been extensively explored as promising hardware accelerators for CNNs owing to their high performance, energy efficiency, and reconfigurability. However, previous FPGA-based methods, which rely on the conventional convolution algorithm, are often bounded by the computational capability of FPGAs. This paper first introduces four convolution algorithms: the 6-loop algorithm, general matrix-matrix multiplication (GEMM), the Winograd algorithm, and the fast Fourier transform (FFT) algorithm. We then present implementations of these algorithms reported both domestically and internationally, together with their corresponding optimization techniques.
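As a rough illustration of the first two algorithms named above, the following Python sketch contrasts a direct 6-loop convolution with a GEMM formulation built on an im2col-style rearrangement. It is a minimal software sketch and not an FPGA implementation from the surveyed work: the function names `conv_six_loop` and `conv_gemm`, the `[C][H][W]` input and `[M][C][K][K]` weight layouts, and the stride-1, no-padding setting are all assumptions made for this example.

```python
def _shapes(x, w):
    # Assumed layouts: x is [C][H][W], w is [M][C][K][K] with square kernels.
    C, H, W = len(x), len(x[0]), len(x[0][0])
    M, K = len(w), len(w[0][0])
    return C, M, K, H - K + 1, W - K + 1  # stride 1, no padding


def conv_six_loop(x, w):
    """Direct convolution written as six nested loops."""
    C, M, K, OH, OW = _shapes(x, w)
    out = [[[0.0] * OW for _ in range(OH)] for _ in range(M)]
    for m in range(M):                      # output channels
        for oh in range(OH):                # output rows
            for ow in range(OW):            # output columns
                for c in range(C):          # input channels
                    for kh in range(K):     # kernel rows
                        for kw in range(K): # kernel columns
                            out[m][oh][ow] += w[m][c][kh][kw] * x[c][oh + kh][ow + kw]
    return out


def conv_gemm(x, w):
    """Same convolution expressed as one matrix-matrix multiplication."""
    C, M, K, OH, OW = _shapes(x, w)
    # im2col: row index is (c, kh, kw); column index is the output pixel (oh, ow).
    cols = [[x[c][oh + kh][ow + kw] for oh in range(OH) for ow in range(OW)]
            for c in range(C) for kh in range(K) for kw in range(K)]
    # Flatten each filter into one row, using the same (c, kh, kw) order.
    wmat = [[w[m][c][kh][kw] for c in range(C) for kh in range(K) for kw in range(K)]
            for m in range(M)]
    # GEMM: (M x C*K*K) times (C*K*K x OH*OW).
    flat = [[sum(wmat[m][r] * cols[r][p] for r in range(C * K * K))
             for p in range(OH * OW)] for m in range(M)]
    # Reshape each output row back to [OH][OW].
    return [[flat[m][oh * OW:(oh + 1) * OW] for oh in range(OH)] for m in range(M)]
```

Both functions compute the same result. The GEMM variant trades extra memory for the rearranged input matrix against a single regular matrix product, which is the kind of trade-off the optimization techniques discussed in the paper address.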
