Abstract

This paper evaluates the use of Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE) for the Run 3 LHCb event building at CERN. The acquisition system of the detector will collect partial data from approximately 1000 separate detector streams, with a total estimated throughput of 32 Terabits per second. Full events will be assembled for subsequent processing and data selection in the filtering farm of the online trigger. High-throughput transmission with up to 90% link utilization will be an essential feature of the system, and the data exchange mechanism must support zero-copy transmissions. In this work, RoCE, a high-throughput kernel-bypass Ethernet protocol, is benchmarked as a potential alternative to InfiniBand. A RoCE-based event building network is presented and two implementations are considered. The first variant combines shallow-buffered and deep-buffered switches with flow control enabled. The second variant uses only deep-buffered devices and relies on their memory throughput and capacity. Feasibility tests were conducted with selected Ethernet switches. Memory bandwidth utilization was investigated and compared with InfiniBand. Relevant utilization and interoperability issues of RoCE flow control are detailed, together with the lessons learned.


Summary

Introduction

The upgrade of the CERN LHCb detector for the upcoming Run 3 [1] imposed significant changes on the data acquisition system [2]. The Software High-Level Trigger, currently under development, will have to handle 32 Terabits per second of input. Fragments of events from all detector parts must be assembled into one structure and dispatched across the nodes handling the trigger data selection. With the currently considered 100 Gbit/s links, efficient transmissions with a small memory footprint can be achieved with Remote Direct Memory Access (RDMA). This zero-copy protocol, as opposed to TCP and UDP, does not make a copy of the data in the kernel. This drastically lowers the necessary memory bandwidth and CPU usage. A protocol called “RDMA over Converged Ethernet version 2” (RoCE v2) [5] has recently become a promising and potentially applicable alternative.
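
To give a feeling for the scale implied by these figures, the short sketch below works through the arithmetic. It uses only numbers quoted in the text (32 Tbit/s total input, 100 Gbit/s links, up to 90% link utilization, approximately 1000 detector streams). The function names are hypothetical, and the even split of traffic across streams is a simplifying assumption, not something stated by the authors. The result is a lower bound on link count for carrying the aggregate input once, not a description of the actual network design.

```python
# Back-of-the-envelope sizing from the figures quoted in the text.
# Assumption (not from the paper): traffic is spread evenly over the streams.

TOTAL_INPUT_GBIT = 32_000      # 32 Tbit/s of input to the trigger
LINK_SPEED_GBIT = 100          # currently considered link speed
UTILIZATION_PERCENT = 90       # target link utilization ("up to 90%")
NUM_STREAMS = 1000             # "approximately 1000" detector streams


def min_links(total_gbit: int, link_gbit: int, utilization_percent: int) -> int:
    """Smallest number of links whose usable capacity covers total_gbit."""
    # Integer ceiling division avoids floating-point rounding surprises.
    numerator = total_gbit * 100
    denominator = link_gbit * utilization_percent
    return -(-numerator // denominator)


def per_stream_gbit(total_gbit: int, streams: int) -> float:
    """Average rate per stream under the even-split assumption."""
    return total_gbit / streams


if __name__ == "__main__":
    links = min_links(TOTAL_INPUT_GBIT, LINK_SPEED_GBIT, UTILIZATION_PERCENT)
    rate = per_stream_gbit(TOTAL_INPUT_GBIT, NUM_STREAMS)
    print(f"links needed at {UTILIZATION_PERCENT}% utilization: {links}")
    print(f"average rate per stream: {rate} Gbit/s")
```

Under these assumptions, at least 356 links of 100 Gbit/s are needed to carry the aggregate input at 90% utilization, and each stream contributes about 32 Gbit/s on average.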

Advantages and limitations of RoCE v2
Proposal of Ethernet RoCE v2 based Event Builder
Long-run RoCE v2 test and comparison with InfiniBand
Evaluating switches throughput for the deep-buffered only variant
Findings
Conclusions