Abstract

The ATLAS Trigger and Data Acquisition system is responsible for the online processing of live data streaming from the ATLAS experiment at the Large Hadron Collider at CERN. The online farm is composed of ∼3000 servers, processing the data read out from ∼100 million detector channels through multiple trigger levels. During the two years of the first Long Shutdown, the ATLAS Trigger and Data Acquisition System Administrators carried out a tremendous amount of work: implementing numerous new software applications, upgrading the OS and the hardware, changing some design philosophies and exploiting the High-Level Trigger farm for new purposes. The OS has been upgraded to SLC6; for the largest part of the farm, which is composed of net booted nodes, this required a completely new design of the net booting system. In parallel, the migration of the configuration management systems to Puppet has been completed for both net booted and locally booted hosts; the Post-Boot Scripts system and Quattor have consequently been retired. Virtual machine usage has been investigated and tested, and many of the core servers now run on virtual machines. Virtualisation has also been used to operate the High-Level Trigger farm as a batch system, which has run Monte Carlo production jobs that are mostly CPU-bound rather than I/O-bound. Finally, monitoring the health and status of ∼3000 machines in the experimental area is of the utmost importance, so the obsolete Nagios v2 has been replaced with Icinga, complemented by Ganglia as a performance data provider. This paper reports on the actions taken by the Systems Administrators to deliver a system capable of performing for the next three years of ATLAS data taking.

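As a concrete illustration of the monitoring migration mentioned in the abstract, the sketch below shows a minimal Icinga/Nagios-style check written in Python that pulls a metric from a Ganglia gmond collector (gmond dumps its cluster state as XML to any client connecting to its TCP port, 8649 by default) and maps it onto the standard plugin exit codes. The collector host name, the chosen metric and the thresholds are assumptions for this sketch, not values from the ATLAS TDAQ setup.

    import socket
    import sys
    import xml.etree.ElementTree as ET

    GMOND_HOST = "gmond.example.org"  # hypothetical collector host
    GMOND_PORT = 8649                 # gmond's default XML port
    METRIC = "load_one"               # standard Ganglia 1-minute load metric
    WARN, CRIT = 8.0, 16.0            # illustrative thresholds

    def fetch_gmond_xml(host, port):
        """Read the full XML dump that gmond sends on connect."""
        chunks = []
        with socket.create_connection((host, port), timeout=10) as sock:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks)

    def main():
        try:
            root = ET.fromstring(fetch_gmond_xml(GMOND_HOST, GMOND_PORT))
        except (OSError, ET.ParseError) as exc:
            print("UNKNOWN - cannot read gmond XML: %s" % exc)
            sys.exit(3)
        # Worst (highest) value of the metric across all reported hosts.
        worst = max((float(m.get("VAL")) for m in root.iter("METRIC")
                     if m.get("NAME") == METRIC), default=None)
        if worst is None:
            print("UNKNOWN - metric %s not found" % METRIC)
            sys.exit(3)
        if worst >= CRIT:
            print("CRITICAL - %s = %.2f" % (METRIC, worst))
            sys.exit(2)
        if worst >= WARN:
            print("WARNING - %s = %.2f" % (METRIC, worst))
            sys.exit(1)
        print("OK - %s = %.2f" % (METRIC, worst))
        sys.exit(0)

    if __name__ == "__main__":
        main()

Icinga retains the Nagios plugin protocol (exit codes 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN), which is why a replacement of Nagios v2 by Icinga can reuse existing checks largely unchanged.
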
Highlights

  • After three years of Large Hadron Collider (LHC) beam (Run 1) and two years of upgrades performed during the Long Shutdown 1 (LS1), the LHC has officially restarted. During the LS1, a tremendous amount of work has been done by the ATLAS [1] Trigger and Data Acquisition (TDAQ) Systems Administrators (SysAdmins)

  • New software applications have been implemented, the operating system (OS) and most of the hardware have been upgraded, some design philosophies have been changed, and the SysAdmins have repurposed the High-Level Trigger (HLT) [2] computing farm as a batch system to run specialised offline jobs (see the sketch after this list)

  • One Local File Server (LFS) is installed per rack in the HLT farm, with a couple shared between various sub-detector systems

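The highlight above on reusing the HLT farm as a batch system refers to Monte Carlo production jobs that are CPU-bound rather than I/O-bound. The toy Python sketch below is purely illustrative of that workload class (it is not the ATLAS production framework): a Monte Carlo estimate of π split across worker processes, which scales with the available cores and performs essentially no I/O.

    import random
    from multiprocessing import Pool

    def mc_pi_hits(samples):
        """Count random points falling inside the unit quarter circle."""
        hits = 0
        for _ in range(samples):
            x, y = random.random(), random.random()
            if x * x + y * y <= 1.0:
                hits += 1
        return hits

    def estimate_pi(total_samples, workers):
        """Split a CPU-bound Monte Carlo estimate across processes."""
        per_worker = total_samples // workers
        with Pool(workers) as pool:
            hits = sum(pool.map(mc_pi_hits, [per_worker] * workers))
        return 4.0 * hits / (per_worker * workers)

    if __name__ == "__main__":
        # The workload is pure computation: well suited to an HLT-style CPU farm.
        print(estimate_pi(4000000, workers=8))
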
Introduction

After three years of Large Hadron Collider (LHC) beam (Run 1) and two years of upgrades performed during the Long Shutdown 1 (LS1), the LHC has officially restarted. During the LS1, a tremendous amount of work has been done by the ATLAS [1] Trigger and Data Acquisition (TDAQ) Systems Administrators (SysAdmins).
