Towards a component-based acceleration of convolutional neural networks on FPGAs

Danielle Tchuinkou Kwadjo, Erman Nghonda Tchinda, Joel Mandebi Mbongue, Christophe Bobda

doi:10.1016/j.jpdc.2022.04.025

Open Access

https://doi.org/10.1016/j.jpdc.2022.04.025

Journal: Journal of Parallel and Distributed Computing
Publication Date: May 6, 2022
Citations: 4
License type: publisher-specific-oa

Affiliation: University of Florida

Abstract

In recent years, Convolutional Neural Networks (CNNs) have been extensively adopted in a broad range of Artificial Intelligence (AI) applications and have proven effective at solving learning problems. However, developing high-performance hardware accelerators for CNNs on Field Programmable Gate Arrays (FPGAs) often demands skills in hardware design and verification, accurate distribution localization, and long development cycles. Moreover, CNN architectures grow deeper by reusing and replicating several layers. In this work, we take advantage of this replication of CNN layers to improve design performance and productivity. We propose a programming flow for CNNs on FPGAs that generates high-performance accelerators by assembling pre-implemented CNN components like puzzle pieces according to the graph topology. Using pre-implemented components allows us to use a minimum of resources, predict performance, and gain productivity, since there is no need to synthesize any Hardware Description Language (HDL) source code. Furthermore, the pre-implemented components are reusable across a range of applications, reducing engineering time. Through prototyping, we demonstrate the viability and relevance of our approach. Experiments show a productivity improvement of up to 69% compared to a traditional FPGA implementation, while achieving over 1.75× higher Fmax with lower resource usage and higher energy efficiency.

Full Text