A FPGA-based end-to-end acceleration framework for fast deployment of Convolutional Neural Networks

Lin Zhang, Binfeng Liu, Bing Li

doi:10.1088/1742-6596/1780/1/012022

Open Access

https://doi.org/10.1088/1742-6596/1780/1/012022

Journal: Journal of Physics: Conference Series
Publication Date: Feb 1, 2021
Citations: 3
License type: CC BY

Affiliation: Southeast University

Abstract

Convolutional neural networks (CNNs) deliver state-of-the-art performance in fields such as computer vision and image classification. As CNNs grow deeper, it becomes more difficult to implement CNN applications on general-purpose computing platforms. Many FPGA-based CNN accelerators have been proposed recently. These accelerators achieve high performance on specific CNN models, but they lack the reconfigurability needed to fit different applications. To address this problem, this paper proposes an end-to-end acceleration framework that consists of a parameterized hardware accelerator and a fully automatic software framework. Parallel computation and pipeline optimization are used in the hardware design to achieve high performance, and runtime reconfigurability is implemented through a global register list. By encapsulating the underlying driver, a three-layer software framework lets users deploy their pre-trained models. A typical CNN model for handwritten digit recognition was selected to test and verify the accelerator. The experimental results show that the accelerator reaches a recognition speed of 22.65 FPS at a clock frequency of 100 MHz. Compared with an ARM Cortex-A9 running at 650 MHz, it achieves a 25.9x speedup while consuming only 1.59 W of power.
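The figures reported in the abstract imply a few further quantities that help put the result in context. The following is a minimal sketch of that arithmetic, not a computation taken from the paper. It assumes that frames are processed one at a time (so per-frame latency is the reciprocal of throughput) and that the 1.59 W figure applies while the accelerator runs at the reported speed. The function name `derived_metrics` and the dictionary keys are hypothetical, chosen only for illustration.

```python
# Figures quoted in the abstract.
ACCEL_FPS = 22.65        # recognition speed of the accelerator (frames/s)
ACCEL_CLOCK_HZ = 100e6   # accelerator clock frequency (Hz)
SPEEDUP_VS_ARM = 25.9    # reported speedup over the ARM Cortex-A9 at 650 MHz
ACCEL_POWER_W = 1.59     # reported power consumption (W)


def derived_metrics(fps, clock_hz, speedup, power_w):
    """Return quantities implied by the reported figures.

    These values are derived here for illustration; the abstract does not
    state them. Latency assumes one frame is in flight at a time.
    """
    return {
        "latency_ms": 1000.0 / fps,            # time per frame
        "cycles_per_frame": clock_hz / fps,    # accelerator clock cycles per frame
        "baseline_fps": fps / speedup,         # implied ARM Cortex-A9 throughput
        "energy_per_frame_j": power_w / fps,   # energy spent per recognized frame
    }


metrics = derived_metrics(ACCEL_FPS, ACCEL_CLOCK_HZ, SPEEDUP_VS_ARM, ACCEL_POWER_W)
for name, value in metrics.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions, the accelerator spends roughly 4.4 million clock cycles and about 0.07 J on each frame, and the implied software baseline on the Cortex-A9 is below one frame per second.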