Abstract
The Czech national HPC center IT4Innovations, located in Ostrava, provides two HPC systems, Anselm and Salomon. The Salomon HPC has been among the hundred most powerful supercomputers in the world since its commissioning in 2015. Both clusters were tested for use by the ATLAS experiment for running simulation jobs. Several thousand core hours were allocated to the project for tests, but the main aim is to use free resources waiting for the large parallel jobs of other users. Multiple strategies for ATLAS job execution were tested on the Salomon and Anselm HPCs. The solution described herein builds on ATLAS experience with other HPC sites. An ARC Compute Element (ARC-CE) installed at the grid site in Prague is used for job submission to Salomon. The ATLAS production system submits jobs to the ARC-CE via the ARC Control Tower (aCT). The ARC-CE processes job requirements from aCT and creates a script for the batch system, which is then executed via ssh. Sshfs is used to share scripts and input files between the site and the HPC cluster. The software used to run the jobs is rsynced from the site's CVMFS installation to the HPC's scratch space every day to ensure availability of recent software. With this setup, opportunistic capacity of the Salomon HPC was exploited.
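The file-sharing arrangement described in the abstract (an sshfs mount between the grid site and the cluster, plus a daily rsync of the ATLAS software from the site's CVMFS tree to the HPC scratch space) can be sketched as a short shell script. The host names, mount points, and paths below are illustrative assumptions, not the site's actual configuration:

```shell
#!/bin/bash
# Sketch of the site-to-HPC sharing and daily software sync described above.
# All hosts and paths are hypothetical placeholders.

set -euo pipefail

HPC_HOST="login.salomon.example.cz"      # assumed Salomon login node
REMOTE_SCRATCH="/scratch/atlas"          # assumed scratch area on the HPC
LOCAL_MOUNT="/mnt/salomon-scratch"       # mount point at the grid site
CVMFS_SW="/cvmfs/atlas.cern.ch/repo/sw"  # site's CVMFS software tree

# Share scripts and input files between the site and the cluster via sshfs,
# so the ARC-CE can place batch scripts where the HPC jobs will find them.
mkdir -p "${LOCAL_MOUNT}"
sshfs "${HPC_HOST}:${REMOTE_SCRATCH}" "${LOCAL_MOUNT}" -o reconnect

# Daily sync (e.g. from cron): mirror the CVMFS software installation to the
# HPC scratch space so worker nodes without CVMFS see recent releases.
rsync -a --delete "${CVMFS_SW}/" "${HPC_HOST}:${REMOTE_SCRATCH}/sw/"
```

In this kind of setup the rsync step is typically run from a daily cron job at the grid site, since the HPC worker nodes cannot mount CVMFS directly.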
Highlights
The Czech National Supercomputer Center IT4Innovations in Ostrava operates the Salomon HPC system, the most powerful computer in the Czech Republic; it is listed in the TOP500 and was ranked 87th in the world as of November 2017 [1]
Salomon was built in 2015 and provides 2 PFLOPS of peak performance. It consists of 1008 computational nodes, each with 24 Intel Xeon E5 cores and 128 GB of RAM, interconnected with InfiniBand (56 Gbps)
Jobs that wait for resources for a long time are killed in the batch system and reassigned by the ATLAS production system to another site to ensure timely completion of tasks
Summary
The Czech National Supercomputer Center IT4Innovations in Ostrava operates the Salomon HPC system, the most powerful computer in the Czech Republic; it is listed in the TOP500 and was ranked 87th in the world as of November 2017 [1]. Salomon was built in 2015 and provides 2 PFLOPS of peak performance. It consists of 1008 computational nodes, each with 24 Intel Xeon E5 cores and 128 GB of RAM, interconnected with InfiniBand (56 Gbps). 432 of the nodes also contain 61-core Intel Xeon Phi accelerators. The ATLAS experiment at CERN [2] uses the Salomon cluster in an opportunistic fashion via the Czech Tier-2 site (praguelcg2) [3]. The non-accelerated nodes are available for opportunistic usage and, unlike on other opportunistic HPC resources, there is no job pre-emption. Jobs that wait for resources for a long time are killed in the batch system and reassigned by the ATLAS production system to another site to ensure timely completion of tasks. The first successful ATLAS job submitted to Salomon via the ARC-CE finished in December 2017, and since then Salomon has continuously and significantly contributed to the Czech Tier-2 ATLAS production output