Abstract

The CERN ATLAS experiment successfully uses a worldwide computing infrastructure to support its physics program during LHC Run 2. The Grid workflow system PanDA routinely manages 250 to 500 thousand concurrently running production and analysis jobs that process simulation and detector data. In total, more than 370 PB of data are distributed over more than 150 sites in the WLCG and handled by the ATLAS data management system Rucio. To prepare for the ever-growing LHC luminosity in future runs, new developments are underway to use opportunistic resources such as HPCs more efficiently and to adopt new technologies. This paper reviews the architecture and performance of the ATLAS distributed computing system and gives an outlook on new workflow and data management ideas for the beginning of LHC Run 3. It is shown that the ATLAS workflow and data management systems are robust and performant and easily cope with the increased Run 2 LHC performance. There are presently no scaling issues, and each subsystem is able to sustain the large loads.

Highlights

  • During Run 2 the Large Hadron Collider (LHC) continues to deliver large amounts of data to the experiments

  • The distributed computing system of the ATLAS experiment [1], as outlined in Fig. 1 (right), is built around two main components: the workflow management system PanDA [3] and the data management system Rucio [4]

  • It manages the computing resources to process this data at the Tier-0 at CERN, reprocesses it once per year at the Tier-1 and Tier-2 WLCG [5] Grid sites, and runs continuous Monte Carlo (MC) simulation and reconstruction

Summary

Introduction

During Run 2 the Large Hadron Collider (LHC) continues to deliver large amounts of data to the experiments (see Fig. 1, left). The distributed computing system of the ATLAS experiment [1], as outlined in Fig. 1 (right), is built around two main components: the workflow management system PanDA [3] and the data management system Rucio [4]. It manages the computing resources to process this data at the Tier-0 at CERN, reprocesses it once per year at the Tier-1 and Tier-2 WLCG [5] Grid sites, and runs continuous Monte Carlo (MC) simulation and reconstruction. In addition, continuous distributed analysis by several hundred ATLAS users is executed. The ATLAS distributed computing system performs very well in handling and processing all this data.
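
As a concrete illustration of the data management side, the sketch below uses the Rucio Python client to look up which Grid storage elements hold replicas of a dataset. This is a minimal sketch only: it assumes a configured Rucio environment with valid Grid credentials, and the dataset scope and name are hypothetical placeholders, not datasets discussed in the paper.

    # Minimal sketch: querying replica locations with the Rucio Python client.
    # Assumes a configured Rucio environment (rucio.cfg, valid credentials);
    # the scope and dataset name are hypothetical placeholders.
    from rucio.client import Client

    client = Client()

    # Rucio addresses data by a DID (data identifier): scope plus name.
    did = {'scope': 'mc16_13TeV', 'name': 'mc16_13TeV.example.simul.HITS'}

    # list_replicas yields one record per file in the dataset, including
    # the storage elements (RSEs) where physical replicas are available.
    for file_record in client.list_replicas([did]):
        rses = sorted(file_record['rses'].keys())
        print(f"{file_record['name']}: {len(rses)} replica(s) at {rses}")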

Data processing in 2017 and 2018
Data management
Production activities
Tier-0
Distributed Analysis
Monitoring and Analytics
Evolution of the systems
Findings
Conclusions
