Abstract

The current distributed computing resources used for simulating and processing collision data collected by ATLAS and the other LHC experiments are largely based on dedicated x86 Linux clusters. Access to resources, job control, and software provisioning mechanisms are quite different from the common concept of self-contained HPC applications run by particular users on specific HPC systems. We report on the development and usage in ATLAS of an SSH backend to the Advanced Resource Connector (ARC) middleware to enable HPC-compliant access, and on the corresponding software provisioning mechanisms.

Highlights

  • The Worldwide LHC Computing Grid (WLCG) [1] has been set up to meet the computing needs of ATLAS [2] and the other CERN Large Hadron Collider (LHC) experiments

  • We developed an extension to the Advanced Resource Connector (ARC) middleware to include an interface to submit and manage ATLAS Production ANd Distributed Analysis framework (PanDA) workloads as jobs to a resource manager of a remote High Performance Computing (HPC) machine

  • The ARC-Computing Element (CE) interface to SLURM was modified to generate a special job script, which takes into account the hybrid SLURM/ALPS architecture of Cray HPC systems and runs the job through ALPS aprun when it is executed by SLURM (a minimal sketch of this job-script generation follows this list)
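
As referenced in the last highlight, the job-script generation can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the actual ARC-CE SLURM submit script: the function name make_cray_job_script, the resource parameters, and the pilot payload path are illustrative. The essential point is that the generated batch script does not invoke the payload directly but hands it to ALPS aprun, so that it executes on the Cray compute nodes rather than on the service node that runs the batch script.

#!/usr/bin/env python3
"""Sketch of generating a SLURM job script for a Cray hybrid SLURM/ALPS
system: SLURM allocates the resources, aprun places the payload on the
compute nodes. Names and paths are illustrative, not the ARC-CE code."""

import textwrap


def make_cray_job_script(payload_cmd, nodes=1, tasks_per_node=8,
                         walltime="01:00:00", job_name="panda_pilot"):
    """Return the text of a SLURM batch script that runs payload_cmd
    through aprun instead of invoking it directly."""
    ntasks = nodes * tasks_per_node
    return textwrap.dedent(f"""\
        #!/bin/bash
        #SBATCH --job-name={job_name}
        #SBATCH --nodes={nodes}
        #SBATCH --ntasks={ntasks}
        #SBATCH --time={walltime}

        # On a Cray system the batch script runs on a service (MOM) node;
        # aprun launches the payload on the allocated compute nodes.
        aprun -n {ntasks} -N {tasks_per_node} {payload_cmd}
        """)


if __name__ == "__main__":
    # Illustrative payload path only
    print(make_cray_job_script("/scratch/panda/pilot/runpilot.sh"))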

Summary

Introduction

The Worldwide LHC Computing Grid (WLCG) [1] has been set up to meet the computing needs of ATLAS [2] and the other CERN Large Hadron Collider (LHC) experiments. High Performance Computing (HPC) centres worldwide provide general purpose, high-grade (non-distributed) systems that are used for a wide range of computationally intensive tasks in various fields, including climate research, weather forecasting, molecular modelling, and quantum mechanics. The use of such systems is typically regulated by strict rules, with individual users granted access in order to run self-contained applications that are developed and built for the system's architecture.

Workload Management

The Production ANd Distributed Analysis framework (PanDA) [11] is the ATLAS approach to a data-driven workload manager. It has been designed and developed by ATLAS to meet challenging requirements on throughput, scalability, robustness, minimal operations manpower, and efficiently integrated data and processing management. Interfacing the workload management system poses a particular middleware challenge, since the required services cannot generally be deployed on demand in an HPC centre, nor on its compute nodes.
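
To make the submission path concrete, here is a minimal sketch, under stated assumptions, of how a service running outside an HPC centre could submit and monitor a batch job purely over SSH, in the spirit of the ARC SSH backend reported here. The host, user, key file, job-script path, and the use of the paramiko library are illustrative choices rather than the actual ARC implementation.

#!/usr/bin/env python3
"""Illustrative sketch: submitting and monitoring a SLURM job on a remote
HPC login node purely over SSH. Host, user, key and paths are placeholders;
this is not the ARC SSH backend itself."""

import paramiko


def _connect(host, user, key_file):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user, key_filename=key_file)
    return client


def submit_job(host, user, key_file, job_script):
    """Run sbatch on the remote login node and return the SLURM job ID."""
    client = _connect(host, user, key_file)
    try:
        _, stdout, stderr = client.exec_command(f"sbatch {job_script}")
        out = stdout.read().decode()
        # sbatch prints e.g. "Submitted batch job 123456" on success
        if "Submitted batch job" not in out:
            raise RuntimeError("sbatch failed: " + stderr.read().decode())
        return out.split()[-1]
    finally:
        client.close()


def job_state(host, user, key_file, job_id):
    """Return the job state (e.g. PENDING, RUNNING) from squeue, or
    'FINISHED' once the job has left the queue."""
    client = _connect(host, user, key_file)
    try:
        _, stdout, _ = client.exec_command(f"squeue -h -j {job_id} -o %T")
        state = stdout.read().decode().strip()
        return state if state else "FINISHED"
    finally:
        client.close()

A full backend would additionally track many such jobs on behalf of PanDA and stage input and output files, which a sketch of this size leaves out.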

Access to Resources
File system access
Findings
Conclusions