Abstract

In recent years, Convolutional Neural Networks (CNNs) have been incorporated into a large number of applications, including multimedia retrieval and image classification. However, CNN-based algorithms are computationally and resource intensive and therefore difficult to deploy in embedded systems. FPGA-based accelerators are becoming increasingly popular in research and industry due to their flexibility and energy efficiency. However, the available resources and the size of the on-chip memory can limit the performance of an FPGA accelerator for CNNs. This work proposes a High-Level Synthesis (HLS) library for CNN algorithms. It contains seven different streaming-capable CNN functions (plus two conversion functions) for creating large neural networks with deep pipelines. The functions offer many parameter settings (e.g. for resolution, feature maps, data types, kernel size, parallelization and accuracy), which also enable compile-time optimizations. Our functions are integrated into HiFlipVX, an open-source HLS FPGA library for image processing and object detection. This offers the possibility to implement different types of computer vision applications with one library. Due to the various configuration and parallelization possibilities of the library functions, a high-performance, scalable and resource-efficient system can be implemented, as our evaluation of the MobileNets algorithm shows.
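
As a rough illustration of how such a compile-time-parameterized, streaming-capable layer function can look in HLS C++, the sketch below implements a simple pointwise (1x1) convolution whose resolution, feature-map counts and degree of parallelization are template parameters. All names and pragmas are illustrative assumptions and do not reproduce the actual HiFlipVX interfaces.

    #include <hls_stream.h>

    template <typename T,          // data type, e.g. ap_int<8> or float
              int ROWS, int COLS,  // input resolution
              int IFM, int OFM,    // number of input / output feature maps
              int PE>              // output feature maps computed in parallel
    void PointwiseConvStream(hls::stream<T> &src, hls::stream<T> &dst,
                             const T weights[OFM][IFM], const T bias[OFM]) {
        for (int y = 0; y < ROWS; ++y) {
            for (int x = 0; x < COLS; ++x) {
                T in[IFM];
    #pragma HLS ARRAY_PARTITION variable=in complete dim=1
                // Buffer one pixel across all input feature maps.
                for (int i = 0; i < IFM; ++i) {
    #pragma HLS PIPELINE II=1
                    in[i] = src.read();
                }
                // Compute the OFM output channels, PE of them per pipeline iteration.
                for (int o = 0; o < OFM; o += PE) {
    #pragma HLS PIPELINE II=1
                    for (int p = 0; p < PE; ++p) {
    #pragma HLS UNROLL
                        T acc = bias[o + p];
                        for (int i = 0; i < IFM; ++i) {
                            acc += weights[o + p][i] * in[i];
                        }
                        dst.write(acc);
                    }
                }
            }
        }
    }

Fixing these values at compile time lets the synthesis tool unroll and pipeline the loops accordingly, which is what enables the resource and performance trade-offs described above.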

Highlights

  • Nowadays, neural network applications are widely used in new technologies such as artificial intelligence and robotics [23]

  • We have researched and implemented different parallelization options to achieve high performance with efficient resource usage, which we show in this paper using our implementation of the MobileNets algorithm [9]

  • Our library is integrated into the HiFlipVX library, which is an open source High-Level Synthesis (HLS) FPGA library for image processing [17] and object detection [15]

Summary

Introduction

Nowadays, neural network applications are widely used in new technologies such as artificial intelligence and robotics [23]. Creating streaming applications with multiple nodes or layers gives FPGAs the ability to achieve higher performance and power efficiency for computer vision algorithms than other architectures, such as CPUs and GPUs, as Kalms et al. [14] and Qasaimeh et al. [24] show. Our library is integrated into the HiFlipVX library, which is an open-source HLS FPGA library for image processing [17] and object detection [15]. This offers the possibility to design and implement different kinds of computer vision applications with one library. Most functions of the libraries are based on the OpenVX standard. This simplifies the design of applications on heterogeneous systems containing different types of architectures (e.g. CPU, GPU and FPGA), due to the different existing implementations from different vendors.
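
The following HLS C++ sketch illustrates the streaming idea behind such deep pipelines: two placeholder stages are connected through an on-chip FIFO inside a dataflow region, so that both run concurrently on the data stream. The function names and stage bodies are hypothetical stand-ins for the library's layer functions, not its actual API.

    #include <hls_stream.h>

    // Placeholder stage: reads a fixed-size stream and applies a dummy operation.
    static void StageA(hls::stream<float> &in, hls::stream<float> &out) {
        for (int i = 0; i < 1024; ++i) {
    #pragma HLS PIPELINE II=1
            out.write(in.read() * 2.0f);
        }
    }

    static void StageB(hls::stream<float> &in, hls::stream<float> &out) {
        for (int i = 0; i < 1024; ++i) {
    #pragma HLS PIPELINE II=1
            out.write(in.read() + 1.0f);
        }
    }

    // Top level: the DATAFLOW pragma lets both stages run concurrently,
    // forming a deep pipeline of streaming nodes.
    void StreamingPipeline(hls::stream<float> &src, hls::stream<float> &dst) {
    #pragma HLS DATAFLOW
        hls::stream<float> fifo("stage_fifo");
    #pragma HLS STREAM variable=fifo depth=32
        StageA(src, fifo);
        StageB(fifo, dst);
    }

Chaining all layers of a network in this way keeps intermediate results on chip and avoids writing feature maps back to external memory between layers.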

Related Work
Implementation
The HiFlipVX Library
Neural Network Layers
Depthwise Convolution
Pooling
Batch Normalization
Fully Connected
Softmax
MobileNets Architecture
High-Level Synthesis Directive Usage
Single Layers
MobileNets
Conclusion