Fast convolutional neural networks on FPGAs with hls4ml

Thea Klaeboe Aarrestad, Nicolò Ghielmetti, Duc Hoang, S Jindariani, J Duarte, M Pierini, D Rankin, S Summers, J Ngadiuba, P Harris, Edward Kreinar, Mia Liu, N V Tran, Vladimir Lončar, Y Iiyama, Giuseppe Di Guglielmo, Z Wu, Christoffer Petersson, Hampus Linander, K Pedro

doi:10.1088/2632-2153/ac0ea1

Abstract

We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.
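The two compression ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not say how pruning or quantization is carried out, so the code below makes assumptions: pruning is magnitude-based (the smallest-magnitude weights are set to zero), and quantization maps a value onto a signed fixed-point grid using round-to-nearest and saturation, with the sign bit counted among the integer bits. The function names `prune_by_magnitude` and `quantize_fixed_point` are hypothetical and not part of hls4ml. The sketch shows only the quantization step itself, not quantization-aware training, which applies such a step during training.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    Assumption: magnitude-based pruning on a flat list of weights.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_fixed_point(value, total_bits, integer_bits):
    """Map `value` onto a signed fixed-point grid.

    Assumptions: `integer_bits` includes the sign bit, values are rounded
    to the nearest grid point, and out-of-range values saturate.
    """
    frac_bits = total_bits - integer_bits
    scale = 2 ** frac_bits
    lowest = -(2 ** (total_bits - 1))       # most negative integer code
    highest = 2 ** (total_bits - 1) - 1     # most positive integer code
    code = round(value * scale)
    code = max(lowest, min(highest, code))  # saturate instead of wrapping
    return code / scale


if __name__ == "__main__":
    print(prune_by_magnitude([0.5, -0.01, 0.2, -0.9], sparsity=0.5))
    print(quantize_fixed_point(0.3, total_bits=6, integer_bits=1))
```

With 6 total bits and 1 integer bit, the representable values are multiples of 1/32 between -1 and 31/32, so fewer bits mean a coarser grid and narrower arithmetic on the FPGA.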

Highlights

The hls4ml library [1, 2] is open-source software designed to facilitate the deployment of machine learning (ML) models on field-programmable gate arrays (FPGAs), targeting low-latency and low-power edge applications.
The development of hls4ml was historically driven by the need to integrate ML algorithms into the first stage of the real-time data processing of particle physics experiments operating at the CERN Large Hadron Collider (LHC).
We have presented the extension of hls4ml to support convolutional neural network (CNN) architectures for transpilation to FPGA designs, through a stream-based implementation of convolutional and pooling layers.
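To make the idea of a stream-based convolution concrete, here is a minimal sketch of one common way to realize it: pixels arrive one at a time in row-major order, a small buffer keeps only the most recent pixels needed to cover one kernel window, and an output is produced as soon as a full window is available. This is an illustration under stated assumptions (single channel, stride 1, no padding, square kernel) and is not necessarily the scheme hls4ml uses; the function name `stream_conv2d` is hypothetical.

```python
from collections import deque


def stream_conv2d(pixels, width, kernel):
    """Consume a row-major pixel stream and yield one value per valid window.

    Assumptions: single channel, stride 1, no padding, k x k kernel.
    """
    k = len(kernel)
    # Enough pixels to span k rows of the window: (k - 1) full rows plus k.
    buf = deque(maxlen=(k - 1) * width + k)
    for idx, pixel in enumerate(pixels):
        buf.append(pixel)
        row, col = divmod(idx, width)
        if row >= k - 1 and col >= k - 1:
            # The buffer is full here, and buf[0] is the top-left pixel of
            # the window whose bottom-right corner is (row, col).
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * buf[i * width + j]
            yield acc


if __name__ == "__main__":
    image = [1, 2, 3,
             4, 5, 6,
             7, 8, 9]
    ones = [[1, 1], [1, 1]]
    print(list(stream_conv2d(image, width=3, kernel=ones)))
```

The point of the design is that the whole image is never stored: memory scales with the kernel height times the image width, which is what makes this style attractive when on-chip memory is scarce.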

Summary

24 April 2021

Thea Aarrestad, Vladimir Lončar, Nicolò Ghielmetti, Maurizio Pierini, Sioni Summers, Jennifer Ngadiuba, Christoffer Petersson, Hampus Linander, Yutaro Iiyama, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Dylan Rankin, Sergo Jindariani, Kevin Pedro, Nhan Tran, Mia Liu, Edward Kreinar, Zhenbin Wu and Duc Hoang

Introduction

Related work

Convolutional layers implementation in hls4ml

Dataset

Baseline model

Compression by pruning

Compression by quantization

FPGA porting

Conclusions

Findings

Code availability statement


Journal: Machine Learning: Science and Technology
Publication Date: Jul 16, 2021
Citations: 54
License type: CC BY
