Abstract

With the ever-growing amount of data collected by the experiments at the Large Hadron Collider (LHC), the need for computing resources to analyse these data is also increasing rapidly. This demand will be amplified further after the upgrade to the High-Luminosity LHC [1]. High-Performance Computing (HPC) and other cluster computing resources provided by universities can usefully supplement the resources dedicated to the experiment within the Worldwide LHC Computing Grid (WLCG) for data analysis and for the production of simulated event samples. Freiburg operates a combined Tier2/Tier3 centre, the ATLAS-BFG [2]. The shared HPC cluster "NEMO" at the University of Freiburg has been made available to local ATLAS [3] users through the provisioning of virtual machines that incorporate the ATLAS software environment, analogous to the bare-metal system of the local Tier2/Tier3 centre. Beyond the provisioning of the virtual environment, the dynamic, on-demand integration of these resources into the Tier3 scheduler is described. To provide the external NEMO resources to users transparently, an intermediate layer connecting the two batch systems is put in place. This resource scheduler monitors the requirements on the user-facing system and requests resources on the backend system.
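
The paper does not prescribe a specific implementation of this intermediate layer, but its monitor-and-request cycle can be sketched. The following Python sketch polls the demand on the user-facing batch system and submits VM-start jobs to the backend HPC system accordingly; the choice of Slurm (squeue) on the front end, MOAB (showq, msub) on the backend, the start_atlas_vm.sh wrapper, the dedicated atlasvm account and the 20-core VM size are illustrative assumptions, not details taken from the paper.

    #!/usr/bin/env python3
    """Minimal sketch of an intermediate resource scheduler: watch the
    demand on the user-facing batch system and request matching VM slots
    on the backend HPC system. Tool names and the VM wrapper script are
    assumptions for illustration."""
    import subprocess
    import time

    POLL_INTERVAL = 60   # seconds between demand checks (assumed)
    CORES_PER_VM = 20    # assumed size of one backend VM

    def pending_cores_frontend() -> int:
        # Sum the CPU cores requested by pending jobs on the user-facing
        # system (Slurm assumed; %C prints the CPU count per job).
        out = subprocess.run(
            ["squeue", "--states=PENDING", "--noheader", "--format=%C"],
            capture_output=True, text=True, check=True).stdout
        return sum(int(n) for n in out.split())

    def running_backend_vms() -> int:
        # Count VM jobs already placed on the backend under a dedicated
        # account ('atlasvm' is hypothetical); counting lines in the
        # "Running" state of MOAB's showq output is an assumption.
        out = subprocess.run(["showq", "-u", "atlasvm"],
                             capture_output=True, text=True, check=True).stdout
        return sum(1 for line in out.splitlines() if "Running" in line)

    def request_backend_vms(n: int) -> None:
        # Submit n single-node VM-start jobs (MOAB's msub assumed);
        # start_atlas_vm.sh is a hypothetical wrapper that boots the
        # ATLAS VM image on the allocated node.
        for _ in range(n):
            subprocess.run(["msub", "-l", f"nodes=1:ppn={CORES_PER_VM}",
                            "start_atlas_vm.sh"], check=True)

    def main() -> None:
        while True:
            missing = pending_cores_frontend() - running_backend_vms() * CORES_PER_VM
            if missing > 0:
                request_backend_vms((missing + CORES_PER_VM - 1) // CORES_PER_VM)
            time.sleep(POLL_INTERVAL)

    if __name__ == "__main__":
        main()

A production system would additionally retire idle VMs and cap the total number of requests, but the monitor-and-request cycle above is the essential behaviour described in the abstract.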

Highlights

  • The analysis of collision data collected at the Large Hadron Collider (LHC) and the simulation of events are primarily done at 2 Tier0, 13 Tier1 and 160 Tier2 sites within the Worldwide LHC Computing Grid (WLCG) [4].

  • High-Performance Computing (HPC) clusters, as provided by universities and other institutions, sometimes even co-located at the same sites, may be used for High Throughput Computing (HTC)-like workflows to extend the capacities of the existing WLCG resources.

  • A CPU performance increase of the order of 5% is observed when going from the Tier2/Tier3 bare metal to the NEMO virtual machines (VMs), while going from the NEMO VMs to the NEMO bare metal yields a further increase of the order of 5% (the two steps compound, as worked out after this list).
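
Since both figures are relative gains, the two steps compound multiplicatively. Assuming both are exactly 5%, the combined gain from the Tier2/Tier3 bare metal to the NEMO bare metal is

    (1 + 0.05) × (1 + 0.05) = 1.1025,

i.e. of the order of 10%.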


Summary

Introduction

The analysis of collision data collected at the LHC and the simulation of events are primarily done at 2 Tier0, 13 Tier1 and 160 Tier2 sites within the WLCG [4]. Benchmarks are used to quantify how the performance changes with the configuration of the resource scheduler and of the virtual machines being spawned. They will be part of a future continuous monitoring effort, in order to detect changes in the submitted workloads. A fully virtualized environment, independent of the choices made on the HPC cluster itself, gives the best possible scope to implement a system that looks and behaves in the same way as the non-virtualized Tier2/Tier3 cluster. This consistency between the two systems would make it possible in the future to redirect ATLAS grid jobs submitted remotely to either NEMO or any other opportunistic resource, as long as the resource provides the infrastructure needed to run the VM images. This information will be used for continuous monitoring of the robustness and performance of the system.
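
The continuous monitoring mentioned above could take a very simple form: run the benchmark periodically on each platform (Tier2/Tier3 bare metal, NEMO VM, NEMO bare metal), keep a short history of scores, and flag runs that deviate from the recent median. The following Python sketch assumes a run_benchmark.sh wrapper that prints a single score, and a 5% tolerance; both are illustrative assumptions, not details from the paper.

    import statistics
    import subprocess
    from collections import defaultdict, deque

    TOLERANCE = 0.05                                  # assumed: flag >5% deviations
    history = defaultdict(lambda: deque(maxlen=50))   # platform -> recent scores

    def run_benchmark() -> float:
        # 'run_benchmark.sh' is a hypothetical wrapper around the actual
        # benchmark suite; it is assumed to print one score on stdout.
        out = subprocess.run(["./run_benchmark.sh"], capture_output=True,
                             text=True, check=True).stdout
        return float(out.strip())

    def record_and_check(platform: str) -> None:
        # Compare the new score against the running median for this
        # platform and report deviations larger than the tolerance.
        score = run_benchmark()
        scores = history[platform]
        if len(scores) >= 5:
            baseline = statistics.median(scores)
            if abs(score - baseline) / baseline > TOLERANCE:
                print(f"{platform}: score {score:.1f} deviates more than "
                      f"{TOLERANCE:.0%} from the recent median {baseline:.1f}")
        scores.append(score)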

Generation of the virtual machines
Connection of front and backend batch systems
Findings
Summary