Abstract

Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1 μs on an FPGA. To do so, we consider a representative task associated with particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the hls4ml library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage.
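The core idea of a distance-weighted graph network can be illustrated with a minimal sketch: each vertex contributes its features to a small set of aggregators, with edge weights that decay exponentially with a learned distance, and the aggregated values are then broadcast back to the vertices. This is a simplified illustration of the concept only, not the exact GarNet layer definition; the single scalar feature per vertex and the weighted-mean aggregation are simplifying assumptions made for clarity.

```python
import math

def distance_weighted_aggregate(features, distances):
    """One message-passing step of a distance-weighted graph layer
    (a simplified illustration, not the exact GarNet definition).

    features:  one scalar feature per vertex
    distances: distances[v][s] = learned distance between vertex v
               and aggregator s
    Returns one output value per vertex.
    """
    n_vertices = len(features)
    n_aggregators = len(distances[0])

    # Edge weights decay exponentially with distance, so vertices far
    # from an aggregator contribute little to it.
    w = [[math.exp(-distances[v][s]) for s in range(n_aggregators)]
         for v in range(n_vertices)]

    # Each aggregator takes a weighted mean of all vertex features.
    aggregated = []
    for s in range(n_aggregators):
        num = sum(w[v][s] * features[v] for v in range(n_vertices))
        den = sum(w[v][s] for v in range(n_vertices))
        aggregated.append(num / den)

    # The aggregated values are sent back to every vertex, weighted
    # by the same distances.
    return [sum(w[v][s] * aggregated[s] for s in range(n_aggregators))
            for v in range(n_vertices)]
```

Because the per-edge work reduces to a multiply-accumulate against a fixed number of aggregators, this structure maps naturally onto the parallel arithmetic units of an FPGA.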

Highlights

  • At the CERN Large Hadron Collider (LHC), high-energy physics (HEP) experiments collect signals generated by the particles produced in high-energy proton collisions that occur every 25 ns, when two proton beams cross

  • We presented an implementation of a graph neural network algorithm as FPGA firmware with O(1) μs execution time

  • We described the simplified version of GARNET, which is available as a general-purpose graph network layer in the hls4ml library

Summary

INTRODUCTION

At the CERN Large Hadron Collider (LHC), high-energy physics (HEP) experiments collect signals generated by the particles produced in high-energy proton collisions that occur every 25 ns, when two proton beams cross. Historically, the strict latency and resource constraints of the Level-1 trigger (L1T) discouraged the use of machine learning (ML) there. The successful deployment of the first ML L1T algorithm at the LHC, based on a boosted decision tree (BDT) (Acosta et al., 2018), has changed this tendency, raising interest in using ML inference as a fast-to-execute approximation of complex algorithms with good accuracy. This first example consisted of a large, pre-computed table of input and output values implementing the BDT, which raises the question of how to deploy more complex architectures. This question motivated the creation of hls4ml (Duarte et al., 2018; Loncar et al., 2020), a library designed to facilitate the deployment of ML algorithms on FPGAs. A typical hls4ml workflow begins with a neural network model that is implemented and trained using KERAS (Keras, 2015), PYTORCH (Paszke et al., 2019), or TENSORFLOW (Abadi et al., 2015).
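One reason such models fit on an FPGA is that hls4ml represents weights and activations in fixed-point rather than floating-point arithmetic. The effect of this on a trained weight can be sketched in plain Python; the bit widths and round-to-nearest behavior below are illustrative assumptions, not the library's defaults.

```python
def quantize_fixed_point(x, total_bits=16, int_bits=6):
    """Round x onto a signed fixed-point grid, mimicking an
    ap_fixed<W,I>-style type as used in FPGA firmware.
    Bit widths here are hypothetical, chosen for illustration.
    """
    frac_bits = total_bits - int_bits
    scale = 1 << frac_bits                      # grid spacing is 1/scale
    # Saturation bounds of a signed fixed-point number.
    lo = -(1 << (int_bits - 1))
    hi = (1 << (int_bits - 1)) - 1.0 / scale
    q = round(x * scale) / scale                # round to nearest grid point
    return min(max(q, lo), hi)
```

For example, `quantize_fixed_point(0.1)` snaps 0.1 to the nearest multiple of 2^-10, and values outside the representable range saturate at the bounds instead of wrapping. Choosing the smallest bit widths that preserve inference accuracy is what makes the synthesized design fit the L1T resource budget.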

RELATED WORK
GENERAL REQUIREMENTS AND CHALLENGES
A SIMPLIFIED GARNET LAYER IN THE HLS4ML FRAMEWORK
CASE STUDY
Dataset
Task and Model Architecture
Training Result
Model Synthesis and Performance
CONCLUSION
Findings
DATA AVAILABILITY STATEMENT