Abstract

The CERN ATLAS experiment successfully uses a worldwide computing infrastructure to support its physics program during LHC Run 2. The Grid workflow system PanDA routinely manages 250 to 500 thousand concurrently running production and analysis jobs to process simulation and detector data. In total, more than 370 PB of data are distributed over more than 150 sites in the WLCG and handled by the ATLAS data management system Rucio. To prepare for the ever-growing LHC luminosity in future runs, new developments are underway to use opportunistic resources such as HPCs more efficiently and to exploit new technologies. This paper reviews the architecture and performance of the ATLAS distributed computing system and gives an outlook on new workflow and data management ideas for the beginning of LHC Run 3. It is shown that the ATLAS workflow and data management systems are robust and performant and can easily cope with the increased Run 2 LHC performance; there are presently no scaling issues, and each subsystem is able to sustain the large loads.


Summary

Introduction

During Run 2 the Large Hadron Collider (LHC) continues to deliver large amounts of data to the experiments (see Fig. 1 left). The distributed computing system of the ATLAS experiment [1], as outlined in Fig. 1 right, is built around two main components: the workflow management system PanDA [3] and the data management system Rucio [4]. It manages the computing resources to process this data at the Tier-0 at CERN, reprocesses it once per year at the Tier-1 and Tier-2 WLCG [5] Grid sites, and runs continuous Monte Carlo (MC) simulation and reconstruction. In addition, continuous distributed analysis from several hundred ATLAS users is executed. The ATLAS distributed computing system is performing very well in handling and processing all of this data.
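
As an illustration of how the data management component is typically used from the client side (this sketch is not taken from the paper; the scope, the dataset name pattern and the presence of a configured Rucio client environment are assumptions), datasets and their Grid replicas can be located with the Rucio Python client roughly as follows:

  # Minimal sketch, assuming a working rucio.cfg / ATLAS VO configuration.
  # The scope and dataset name pattern below are hypothetical examples.
  from rucio.client import Client

  rucio = Client()

  # Find datasets in the (hypothetical) data18_13TeV scope matching a name pattern.
  datasets = rucio.list_dids(
      scope="data18_13TeV",
      filters={"name": "data18_13TeV.*.DAOD_PHYS.*"},
      did_type="dataset",
  )

  for name in datasets:
      # For each dataset, list its file replicas and the storage endpoints (RSEs) holding them.
      for replica in rucio.list_replicas([{"scope": "data18_13TeV", "name": name}]):
          print(replica["name"], sorted(replica.get("rses", {}).keys()))

The workflow management side is complementary: PanDA brokers production and analysis jobs to the Grid sites, while Rucio resolves the input datasets to physical replicas at those sites and manages the produced outputs.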

Data processing in 2017 and 2018
Data management
Production activities
Tier-0
Distributed Analysis
Monitoring and Analytics
Evolution of the systems
Conclusions
