Abstract

The Trigger and DAQ (TDAQ) system of the ATLAS experiment is a complex distributed computing system composed of O(10,000) applications running on more than 2,500 computers. The system is operated by a crew of operators on shift. An important aspect of operations is to minimize the downtime of the system caused by runtime failures, such as human error, lack of awareness or miscommunication. The paper describes recent developments in one of the intelligent TDAQ frameworks, the Shifter Assistant (SA), and summarizes the experience of its use in ATLAS operations during LHC Run 2. SA is a framework whose main aim is to automate routine system checks, error detection and diagnosis, event correlation and similar tasks, in order to help the operators react to runtime problems promptly and effectively. The tool is based on CEP (Complex Event Processing) technology. It constantly processes a stream of operational events (O(100 kHz)) against a set of “directives” (or rules) in the knowledge base, producing human-oriented alerts and making shifters aware of operational issues. More than 200 directives were developed by TDAQ and ATLAS detector experts for different domains. In this paper we also describe the different types of directives that were developed in the course of Run 2, and present a few examples of the most interesting and challenging ones, demonstrating the power of CEP for this type of application.
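To make the idea of a directive concrete, the sketch below shows how a single rule might be evaluated over an event stream. It is an illustration only: the `Event` fields, the `RepeatedErrorDirective` class, the thresholds and the alert text are all hypothetical, and the code does not use the CEP engine or the directive language of the actual Shifter Assistant.

```python
from collections import defaultdict, deque, namedtuple

# Hypothetical event schema (not the TDAQ one): time in seconds,
# emitting application, and event kind such as "ERROR" or "INFO".
Event = namedtuple("Event", "time source kind")


class RepeatedErrorDirective:
    """Hypothetical directive: alert when one source emits at least
    `threshold` ERROR events within a sliding window of `window` seconds."""

    def __init__(self, threshold=3, window=10.0):
        self.threshold = threshold
        self.window = window
        self._recent = defaultdict(deque)  # source -> recent error timestamps

    def on_event(self, event):
        """Feed one event; return an alert string, or None if nothing fires."""
        if event.kind != "ERROR":
            return None
        times = self._recent[event.source]
        times.append(event.time)
        # Drop timestamps that have fallen out of the sliding window.
        while times and event.time - times[0] > self.window:
            times.popleft()
        if len(times) >= self.threshold:
            times.clear()  # reset so that one burst yields one alert
            return (f"{event.source}: {self.threshold} errors within "
                    f"{self.window:g} s, please check the application")
        return None


def run(directives, events):
    """Pass every event through every directive and collect the alerts."""
    alerts = []
    for event in events:
        for directive in directives:
            alert = directive.on_event(event)
            if alert is not None:
                alerts.append(alert)
    return alerts
```

In this sketch the knowledge base is simply the list of directive objects passed to `run`. A real CEP engine would typically express such rules declaratively and evaluate many of them concurrently over the same stream.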

Highlights

  • The ATLAS experiment [1] is one of the major experiments of the Large Hadron Collider (LHC)

  • The ATLAS Trigger & Data Acquisition (TDAQ) system [2] is responsible for the readout, selection and transfer of the selected physics events to the permanent storage, reducing the initial LHC collision frequency of 40 MHz to an average rate of stored physics events (1.5 MB size) of 2-3 kHz

  • The proceedings describe one of the TDAQ Controls tools, the Shifter Assistant, whose task is to intelligently help ATLAS operations; the proceedings summarize the experience with the Shifter Assistant in ATLAS data taking during LHC Run 2
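The rates quoted in the highlights imply a rejection factor and a storage throughput that can be worked out directly. The short calculation below is a back-of-the-envelope sketch: the function names are hypothetical, and 1 MB is taken as 10^6 bytes, which is an assumption rather than something the text states.

```python
COLLISION_RATE_HZ = 40e6        # initial LHC collision frequency (from the text)
EVENT_SIZE_BYTES = 1.5e6        # 1.5 MB per event, assuming 1 MB = 10^6 bytes
STORED_RATES_HZ = (2e3, 3e3)    # average stored-event rate of 2-3 kHz (from the text)


def rejection_factor(stored_rate_hz):
    """Number of collisions per event that is kept."""
    return COLLISION_RATE_HZ / stored_rate_hz


def storage_throughput_gb_per_s(stored_rate_hz):
    """Data rate to permanent storage in GB/s (1 GB = 10^9 bytes)."""
    return stored_rate_hz * EVENT_SIZE_BYTES / 1e9


def collision_spacing_ns():
    """Time between collisions implied by the 40 MHz frequency."""
    return 1e9 / COLLISION_RATE_HZ


if __name__ == "__main__":
    for rate in STORED_RATES_HZ:
        print(f"{rate / 1e3:g} kHz: keep 1 in {rejection_factor(rate):,.0f}, "
              f"{storage_throughput_gb_per_s(rate):g} GB/s to storage")
    print(f"collision spacing: {collision_spacing_ns():g} ns")
```

Under these assumptions the system keeps roughly one collision in 13,000 to 20,000 and writes about 3 to 4.5 GB/s to storage; the 40 MHz frequency corresponds to one collision every 25 ns.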


Summary

Introduction

The ATLAS experiment [1] is one of the major experiments of the Large Hadron Collider (LHC). This general-purpose experiment consists of various tracking detectors (the Inner Detector), electromagnetic and hadronic calorimeters and a muon spectrometer. The detector provides many millions of read-out channels, able to capture data every 25 nanoseconds. This volume of data cannot be recorded and kept for further data analysis. TDAQ software includes central infrastructure services like Controls, Configuration and Monitoring. The proceedings describe one of the TDAQ Controls tools, the Shifter Assistant, whose task is to intelligently help ATLAS operations; the proceedings summarize the experience with the Shifter Assistant in ATLAS data taking during LHC Run 2.

ATLAS operations challenges
Shifter Assistant
SA knowledge base and examples of directives
SA Replay
Conclusions
