Accelerating Neural Network Inference on FPGA-Based Platforms—A Survey

Ran Wu,Xinmin Guo,Junbao Li,Jian Du

doi:10.3390/electronics10091025

Abstract

The breakthrough of deep learning has started a technological revolution in various areas such as object identification, image/video recognition and semantic segmentation. Neural network, which is one of representative applications of deep learning, has been widely used and developed many efficient models. However, the edge implementation of neural network inference is restricted because of conflicts between the high computation and storage complexity and resource-limited hardware platforms in applications scenarios. In this paper, we research neural networks which are involved in the acceleration on FPGA-based platforms. The architecture of networks and characteristics of FPGA are analyzed, compared and summarized, as well as their influence on acceleration tasks. Based on the analysis, we generalize the acceleration strategies into five aspects—computing complexity, computing parallelism, data reuse, pruning and quantization. Then previous works on neural network acceleration are introduced following these topics. We summarize how to design a technical route for practical applications based on these strategies. Challenges in the path are discussed to provide guidance for future work.

Full Text