Abstract

Convolution is widely used in deep neural networks to extract key features, but it requires many additions and multiplications. In this study, a fast computational algorithm is presented that reduces the number of arithmetic operations while preserving accuracy. The order of the deep convolution operations is rearranged to save computational operators. To verify its performance, the proposed algorithm is embedded into the typical deep neural network VggNet. The structure of VggNet is further modified using the proposed summation and concatenation techniques to improve computational accuracy and reduce processing time. Compared with the original VggNet, simulations show that the operational FLOPs are reduced by at least 50% across tests on various datasets. In addition, the training time per batch per epoch is reduced by about 10%–20%. The proposed fast algorithm cuts the number of parameters and the model size by over 90%, and the recognition accuracy improves by 1%–4% across the test datasets. Based on the fast network, a real-time FPGA implementation was realized whose hardware performance reaches 371 GOPs using 642 DSP cores. The processing speed reaches nearly 1 k frames per second, and the real-time recognition rate exceeds 90%.
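The abstract does not detail how the convolution operations are reordered, so the sketch below is only an illustrative FLOP count, not the paper's algorithm: it compares a standard k×k convolution against a depthwise-separable factorization, one common restructuring that yields savings of 50% or more. The layer dimensions are assumed for illustration.

```python
# Illustrative sketch (assumed technique, not the paper's method): FLOP
# counts for a standard convolution vs. a depthwise-separable one.

def conv_flops(h, w, c_in, c_out, k):
    """Multiply-add FLOPs of a standard k x k convolution on an h x w map."""
    return 2 * h * w * c_in * c_out * k * k

def separable_flops(h, w, c_in, c_out, k):
    """FLOPs of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = 2 * h * w * c_in * k * k      # one k x k filter per channel
    pointwise = 2 * h * w * c_in * c_out      # 1 x 1 channel mixing
    return depthwise + pointwise

# Example: a mid-network VggNet-like layer (sizes assumed for illustration).
std = conv_flops(32, 32, 64, 128, 3)
sep = separable_flops(32, 32, 64, 128, 3)
print(f"standard:  {std:,} FLOPs")    # 150,994,944
print(f"separable: {sep:,} FLOPs")    # 17,956,864
print(f"reduction: {1 - sep / std:.1%}")
```

For these assumed dimensions the factorized form needs roughly 88% fewer FLOPs, comfortably beyond the 50% reduction the abstract reports for the proposed reordering.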
