Abstract

The HPC environment presents several challenges to the ATLAS experiment in running its automated computational workflows smoothly and efficiently, in particular regarding software distribution and I/O load. CVMFS, a vital component of the LHC Computing Grid, is not always available in HPC environments. ATLAS computing experimented with all-inclusive containers, and later developed an environment to produce such containers for both Shifter and Singularity. The all-inclusive containers include most of the recent ATLAS software releases, database releases, and other tools extracted from CVMFS. This has helped ATLAS to distribute software automatically to HPC centres with an environment identical to that provided by CVMFS. It has also significantly reduced the metadata I/O load on HPC shared file systems. Production operation at NERSC has shown that, by using this type of container, we can fit transparently into the previously developed ATLAS operational methods and at the same time scale up to run many more jobs.

Highlights

  • The Grid computing model developed by the Worldwide LHC Computing Grid (WLCG) provided most of the computing resources for LHC Run 1 and Run 2

  • Special attention is needed to the per-file hard link limit imposed by the file system during the building process: we found that a few files can have as many as 900 k hard links each (see the first sketch after this list)

  • To speed up operations on large numbers of small files during CVMFS data extraction and deduplication, as well as during the squashfs and rsync steps, we use the same technique we proposed for the HPCs: create a large EXT3 file system inside a GPFS file and loop-mount it (see the second sketch after this list)
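The following is a minimal Python sketch of how a hard-link audit of an extracted CVMFS tree could be done before repacking; the root path and the reporting threshold are illustrative assumptions, not values from the paper.

    #!/usr/bin/env python3
    """Sketch: audit hard-link counts in an extracted CVMFS tree.

    The root path and the reporting threshold are illustrative
    assumptions; the aim is to spot inodes whose link count could
    exceed the per-file hard link limit of the target file system.
    """
    import os
    import sys

    def report_high_link_counts(root, threshold=1000):
        """Walk `root` and report inodes with at least `threshold` hard links."""
        seen = {}  # inode number -> (link count, one example path)
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    st = os.lstat(path)
                except OSError:
                    continue  # skip files that disappear or are unreadable
                if st.st_nlink >= threshold and st.st_ino not in seen:
                    seen[st.st_ino] = (st.st_nlink, path)
        # Print the most heavily linked inodes first.
        for ino, (nlink, path) in sorted(seen.items(), key=lambda kv: -kv[1][0]):
            print(f"inode {ino}: {nlink} hard links, e.g. {path}")

    if __name__ == "__main__":
        report_high_link_counts(sys.argv[1] if len(sys.argv) > 1 else ".")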

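Below is a minimal Python sketch of the loop-mount technique described in the last highlight. The image path, size, inode ratio, and mount point are illustrative assumptions, and the mount step normally requires root (or an unprivileged equivalent); it is a sketch of the idea rather than the production procedure used at NERSC.

    #!/usr/bin/env python3
    """Sketch: build a large EXT3 file system inside a single file on a
    shared file system (e.g. GPFS) and loop-mount it, so that metadata
    operations on many small files stay on the loop device instead of
    hitting the shared file system's metadata servers.

    The image path, size, inode ratio, and mount point are illustrative
    assumptions; mounting requires appropriate privileges.
    """
    import subprocess

    IMAGE = "/gpfs/projects/atlas/cvmfs-workspace.img"  # hypothetical path
    MOUNTPOINT = "/mnt/cvmfs-workspace"                 # hypothetical mount point
    SIZE = "500G"                                       # illustrative size

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def create_and_mount():
        # Allocate a sparse file of the requested size on the shared file system.
        run(["truncate", "-s", SIZE, IMAGE])
        # Make an EXT3 file system inside the file; -F skips the
        # "not a block device" prompt, and a small bytes-per-inode ratio
        # (-i 4096) leaves enough inodes for millions of small files.
        run(["mkfs.ext3", "-F", "-i", "4096", IMAGE])
        # Loop-mount the image and do all small-file work inside it.
        run(["mkdir", "-p", MOUNTPOINT])
        run(["mount", "-o", "loop", IMAGE, MOUNTPOINT])

    if __name__ == "__main__":
        create_and_mount()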

Summary

Motivation

The Grid computing model developed by the Worldwide LHC Computing Grid (WLCG) provided most of the computing resources for LHC Run 1 and Run 2. It is clear that the ATLAS experiment [3] needs to exploit non-Grid opportunistic resources in large quantities in order to satisfy the needs of LHC Run 3 and Run 4. Supercomputers such as Cori [4] and Edison [5] at NERSC in the United States, Theta [6] at ALCF, Titan [7] at OLCF, Piz Daint [8] at CSCS in Switzerland and MareNostrum [9] at BSC in Spain are typically much larger systems than individual Grid sites. This paper discusses two of the resulting challenges: making ATLAS software available on HPCs and reducing metadata I/O on HPC shared file systems.

Making ATLAS software available on HPCs
Building all-inclusive containers for ATLAS
Extracting CVMFS contents and deduplication
Filtering
Packing software into a container
Container building environment
Use cases
Next steps
Conclusion