Abstract
During LHC Run 1, the information flow through the offline data quality monitoring in ATLAS relied heavily on chains of processes polling each other's outputs for handshaking purposes. This resulted in a fragile architecture with many possible points of failure and an inability to monitor the overall state of the distributed system. We report on the status of a project undertaken during the LHC shutdown to replace the ad hoc synchronization methods with a uniform message queue system. This enables the use of standard protocols to connect processes on multiple hosts; reliable transmission of messages between possibly unreliable programs; easy monitoring of the information flow; and the removal of inefficient polling-based communication.
Highlights
The ATLAS detector [1] offline data quality (DQ) system [2, 3] consists of a heterogeneous set of applications running on multiple nodes at CERN.
Atlas.dqm.progress: General information on ATLAS DQ and run status is published on this topic.
To let ATLAS collaborators use the messaging bus without each having to obtain their own password, the relevant password file is stored in CERN AFS and protected with appropriate ACLs.
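As a rough illustration of the topic-based publishing described above, the following sketch uses an in-memory stand-in for a message broker. It is not the ATLAS implementation: the `TopicBus` class, the message fields (`run`, `status`), and the run number are hypothetical, and a real deployment would use a broker client library in place of this class. The point is only that a subscriber is notified when a message arrives on a topic, rather than polling another process's output.

```python
from typing import Callable, Dict, List

# A handler receives the topic name and the message payload.
Handler = Callable[[str, dict], None]


class TopicBus:
    """In-memory stand-in for a message broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on a topic."""
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver a message to all subscribers; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, message)
        return len(handlers)


bus = TopicBus()
received = []

# The consumer reacts to messages as they arrive; no polling loop is needed.
bus.subscribe("Atlas.dqm.progress", lambda topic, msg: received.append((topic, msg)))

# Hypothetical payload: the field names and values are assumptions.
delivered = bus.publish("Atlas.dqm.progress", {"run": 123456, "status": "processing_done"})
```

In this sketch the publisher does not need to know which processes are listening, which is the property that lets a message queue replace chains of processes polling each other.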
Summary
Published in: 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Journal of Physics: Conference Series 664 (2015) 062045, doi:10.1088/1742-6596/664/6/062045 (http://iopscience.iop.org/1742-6596/664/6/062045).