Abstract

The ALICE experiment at the CERN LHC focuses on studying the quark-gluon plasma produced by heavy-ion collisions. Starting from 2021, it will see its input data throughput increase a hundredfold, up to 3.5 TB/s. To cope with such a large amount of data, a new online-offline computing system, called O2, will be deployed. It will synchronously compress the data stream by a factor of 35, down to 100 GB/s, before storing it permanently.

One of the key software components of the system will be the data Quality Control (QC). This framework and infrastructure is responsible for all aspects of the analysis software aimed at identifying possible issues with the data itself, and indirectly with the underlying processing done both synchronously and asynchronously. Since analyzing the full stream of data online would exceed the available computational resources, a reliable and efficient sampling will be needed. It should provide a few percent of the data, selected randomly in a statistically sound manner, with a minimal impact on the main dataflow. Extra requirements include, e.g., the option to choose data corresponding to the same collisions across a group of computing nodes.

In this paper the design of the O2 Data Sampling software is presented. In particular, the requirements for pseudo-random number generators to be used for sampling decisions are highlighted, as well as the results of the benchmarks performed to evaluate different possibilities. Finally, a large scale test of the O2 Data Sampling is reported.
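As an illustration of the sampling decision described above, the sketch below shows one possible way to select a fixed fraction of timeframes deterministically: hashing a collision or timeframe identifier together with a shared seed lets independent nodes accept or reject the same data without communicating. This is only a minimal sketch under assumed names (shouldSample, the splitmix64-style mixer, the 2% fraction); the paper itself benchmarks several pseudo-random number generators for this purpose rather than prescribing this particular scheme.

    #include <cstdint>
    #include <iostream>

    // splitmix64-style mixer, used here purely for illustration.
    uint64_t mix(uint64_t x)
    {
      x += 0x9E3779B97F4A7C15ULL;
      x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
      x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
      return x ^ (x >> 31);
    }

    // Hypothetical helper: decide whether a given timeframe should be sampled.
    // Because the decision depends only on (timeframeId, seed), every node
    // configured with the same seed selects the same timeframes.
    bool shouldSample(uint64_t timeframeId, uint64_t seed, double fraction)
    {
      const uint64_t h = mix(timeframeId ^ seed);
      const double u = static_cast<double>(h) / static_cast<double>(UINT64_MAX); // map onto [0, 1]
      return u < fraction;
    }

    int main()
    {
      const uint64_t seed = 42;      // shared by all nodes
      const double fraction = 0.02;  // sample roughly 2% of timeframes
      int accepted = 0;
      for (uint64_t id = 0; id < 100000; ++id) {
        if (shouldSample(id, seed, fraction)) {
          ++accepted;
        }
      }
      std::cout << "accepted " << accepted << " of 100000 timeframes\n";
    }

Per-message random draws from an independent generator would achieve a similar average fraction, but only an identifier-based decision like this one guarantees that the same collisions are picked on every node, as required above.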

Highlights

  • The ALICE experiment at the CERN Large Hadron Collider (LHC) focuses on studying the quark-gluon plasma produced by heavy-ion collisions

  • The online data Quality Control will be performed by more than 100 QC Tasks, each running in parallel on many nodes, which will spy on various data types generated in consecutive processing stages

  • The task of sampling and providing the data to QC tasks as well as other potential clients will be performed by the Data Sampling software, which is the topic of this paper

Summary

The ALICE experiment

ALICE (A Large Ion Collider Experiment) [1] is one of the four major particle detectors at the CERN Large Hadron Collider (LHC). It is designed to study the quark-gluon plasma by observing fundamental and composite particles appearing in the debris produced by heavy-ion and proton collisions. Since November 2009, the ALICE experiment has successfully recorded data, which has allowed several measurements of the properties of this primordial state of matter.

The ALICE upgrade
Quality Control in the O2 system
Data sampling requirements
Data sampling design
Rationale and approaches for sampling data
Evaluated sampling methods
Sampling methods tests
Tests results
Comprehensive data sampling benchmarks
Findings
Summary