Abstract

ALICE (A Large Ion Collider Experiment) is a heavy-ion detector studying the physics of strongly interacting matter and the quark–gluon plasma at CERN’s LHC (Large Hadron Collider). During the second long shutdown of the LHC, the ALICE detector will be upgraded to cope with an interaction rate of 50 kHz in Pb–Pb collisions, producing in the online computing system (O2) a sustained input throughput of 3 TB/s. The readout software is in charge of the first step of data acquisition: it handles the data transferred from over 8000 detector links to PC memory by dedicated PCIe boards, formatting and buffering the incoming traffic until it is sent to the next components in the processing pipeline. On the 250 readout nodes where it runs, the software has to sustain a throughput that can locally exceed 100 Gb/s. We present the modular design used to cope with various data sources (hardware devices and software emulators), its integration with the central O2 components (logging, configuration, monitoring, data sampling, transport), and the way it initiates the online data flow using the standard O2 messaging system. Performance considerations and measurements are also discussed.
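The abstract describes readout as a buffering stage between DMA-driven data sources and the downstream O2 pipeline. The following minimal C++ sketch illustrates that producer/consumer pattern under stated assumptions: the names (DataPage, PageFifo, the emulated equipment thread) are hypothetical illustrations for this page, not the actual o2-readout code or the O2 messaging API.

```cpp
// Illustrative sketch only: an "equipment" thread fills data pages, a bounded
// FIFO buffers them, and a consumer thread forwards them downstream.
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct DataPage {                      // hypothetical stand-in for a DMA page
  uint32_t linkId = 0;                 // detector link the data came from
  std::vector<uint8_t> payload;        // in reality: a pointer into a pre-allocated memory pool
};

class PageFifo {                       // bounded FIFO decoupling the two threads
public:
  explicit PageFifo(size_t capacity) : capacity_(capacity) {}
  bool push(DataPage&& page) {
    std::unique_lock<std::mutex> lk(m_);
    if (q_.size() >= capacity_) return false;   // full: caller retries (backpressure)
    q_.push(std::move(page));
    cv_.notify_one();
    return true;
  }
  bool pop(DataPage& page, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lk(m_);
    if (!cv_.wait_for(lk, timeout, [&] { return !q_.empty(); })) return false;
    page = std::move(q_.front());
    q_.pop();
    return true;
  }
private:
  size_t capacity_;
  std::queue<DataPage> q_;
  std::mutex m_;
  std::condition_variable cv_;
};

int main() {
  PageFifo fifo(256);
  std::atomic<bool> running{true};
  uint64_t bytes = 0;

  // "Equipment" thread: emulates a data source filling pages (a software
  // emulator in the sense of the abstract; a real device would DMA into memory).
  std::thread equipment([&] {
    for (uint32_t i = 0; i < 1000; ++i) {
      DataPage page;
      page.linkId = i % 24;
      page.payload.assign(8192, static_cast<uint8_t>(i));
      while (!fifo.push(std::move(page))) {                   // wait while the buffer is full
        std::this_thread::sleep_for(std::chrono::microseconds(100));
      }
    }
    running = false;
  });

  // Consumer thread: pops pages and would hand them to the next pipeline stage
  // (e.g. the O2 messaging system); here it only counts the forwarded bytes.
  std::thread consumer([&] {
    DataPage page;
    for (;;) {
      bool producerDone = !running.load();
      if (fifo.pop(page, std::chrono::milliseconds(10))) {
        bytes += page.payload.size();
      } else if (producerDone) {
        break;                                                // producer finished and FIFO drained
      }
    }
  });

  equipment.join();
  consumer.join();
  std::printf("forwarded %llu bytes\n", static_cast<unsigned long long>(bytes));
  return 0;
}
```

In the real system the pages would typically live in large pre-allocated memory regions filled directly by the readout card's DMA engine, and the consumer would pass references to them to the O2 transport rather than touching the payload.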

Highlights

  • ALICE [1] is the heavy-ion detector designed to cope with very high particle multiplicities to study the physics of strongly interacting matter at CERN’s LHC

  • Readout was tested with up to 8 CRUs in a server, using a Supermicro 4029GP-TRT server providing 8 half-duplex PCIe x16 slots, allowing a measured aggregated throughput of 40 GB/s for the 16 equipments (see the arithmetic sketch after this list). This setup does not reach the maximum bandwidth of each CRU, but it is a convenient system for batch testing of the CRUs.

  • Readout is the process used in ALICE O2 to initiate DMA transfer from PCIe readout cards to PC memory, and the first software component in the O2 pipeline
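As a rough cross-check of the figures in the second highlight (assuming PCIe Gen3, whose raw unidirectional bandwidth is about 15.8 GB/s per x16 slot; the PCIe generation is not stated above), the measured aggregate corresponds to

\[
\frac{40~\mathrm{GB/s}}{16~\text{equipments}} = 2.5~\mathrm{GB/s~per~equipment},
\qquad
\frac{40~\mathrm{GB/s}}{8~\text{CRUs}} = 5~\mathrm{GB/s~per~CRU} \ll 15.8~\mathrm{GB/s},
\]

which is consistent with the statement that this batch-test setup does not drive each CRU at its maximum bandwidth.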


Summary

ALICE and the O2 project

ALICE [1] is the heavy-ion detector designed to cope with very high particle multiplicities to study the physics of strongly interacting matter at CERN’s LHC. The Online-Offline system, named O2 [4], will be in charge of reading out these data and processing them on-the-fly, in order to reduce the volume to the 90 GB/s initially recorded to storage. These demanding data acquisition and processing steps will be handled by a computing farm consisting of ~250 nodes for readout (named FLPs – First Level Processors) and ~1500 nodes for online reconstruction (named EPNs – Event Processing Nodes). Data distribution to the EPNs is typically done round-robin, but may take into account load balancing requirements (network and CPU availability).
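As an illustration of the last point, here is a hypothetical C++ sketch of a round-robin distribution policy that skips currently unavailable EPNs; the names (EpnNode, pickEpn) and the availability flag are assumptions made for the example, not the actual O2 scheduling code.

```cpp
// Hypothetical sketch: assign time frames to EPNs round-robin, skipping nodes
// that currently report themselves as unavailable.
#include <cstdio>
#include <optional>
#include <vector>

struct EpnNode {
  int id;
  bool available;   // e.g. derived from network and CPU load reports
};

// Round-robin selection starting after the last node used, skipping
// unavailable nodes; returns nothing if no EPN can accept data right now.
std::optional<int> pickEpn(const std::vector<EpnNode>& epns, size_t& cursor) {
  for (size_t tried = 0; tried < epns.size(); ++tried) {
    cursor = (cursor + 1) % epns.size();
    if (epns[cursor].available) {
      return epns[cursor].id;
    }
  }
  return std::nullopt;   // all EPNs busy: sender must buffer or back off
}

int main() {
  std::vector<EpnNode> epns = {{0, true}, {1, false}, {2, true}, {3, true}};
  size_t cursor = epns.size() - 1;   // so the first pick starts at node 0

  for (int timeframe = 0; timeframe < 6; ++timeframe) {
    if (auto epn = pickEpn(epns, cursor)) {
      std::printf("time frame %d -> EPN %d\n", timeframe, *epn);
    } else {
      std::printf("time frame %d buffered (no EPN available)\n", timeframe);
    }
  }
  return 0;
}
```

A real implementation would derive the availability flag from the network and CPU load information mentioned above, and fall back to buffering on the FLP when no EPN can accept data.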

Readout hardware
Readout software
Readout architecture
Multi-threaded pipeline
Memory management
Performance
Findings
Conclusions