Exploitation of network-segregated CPU resources in CMS

C Acosta-Silva, T Tannenbaum, A Delgado Peris, J Flix, J Frey, A Pérez-Calero Yzquierdo, J.M Hernández

doi:10.1051/epjconf/202125102020

Abstract

CMS is tackling the exploitation of CPU resources at HPC centers where compute nodes do not have network connectivity to the Internet. Pilot agents and payload jobs need to interact with external services from the compute nodes: access to the application software (CernVM-FS) and conditions data (Frontier), management of input and output data files (data management services), and job management (HTCondor). Finding an alternative route to these services is challenging. Seamless integration in the CMS production system without causing any operational overhead is a key goal. The case of the Barcelona Supercomputing Center (BSC), in Spain, is particularly challenging, due to its especially restrictive network setup. We describe in this paper the solutions developed within CMS to overcome these restrictions, and integrate this resource in production. Singularity containers with application software releases are built and pre-placed in the HPC facility shared file system, together with conditions data files. HTCondor has been extended to relay communications between running pilot jobs and HTCondor daemons through the HPC shared file system. This operation mode also allows piping input and output data files through the HPC file system. Results, issues encountered during the integration process, and remaining concerns are discussed.
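The idea of relaying communications through a shared file system can be illustrated with a minimal sketch. This is not the HTCondor implementation: the function names (`post_message`, `collect_messages`, `relay_demo`), the JSON message format, and the `to_node` directory are all hypothetical, chosen only to show how two parties with no network path between them might exchange messages through files. The sketch assumes a POSIX file system on which a rename within one directory is atomic.

```python
from __future__ import annotations

import json
import os
import tempfile
import uuid
from pathlib import Path


def post_message(mailbox: Path, payload: dict) -> Path:
    """Atomically drop one message file into a shared-directory mailbox."""
    mailbox.mkdir(parents=True, exist_ok=True)
    # Write under a temporary name first so a reader never sees a partial file.
    fd, tmp_name = tempfile.mkstemp(dir=str(mailbox), suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(payload, handle)
    final = mailbox / f"{uuid.uuid4().hex}.msg"
    # Rename into place; readers only look for the ".msg" suffix.
    os.replace(tmp_name, final)
    return final


def collect_messages(mailbox: Path) -> list[dict]:
    """Read and delete every complete message currently in the mailbox."""
    messages = []
    for path in sorted(mailbox.glob("*.msg")):
        with path.open() as handle:
            messages.append(json.load(handle))
        path.unlink()  # consume the message so it is not delivered twice
    return messages


def relay_demo() -> list[dict]:
    """Simulate an outside service posting two requests that a node picks up."""
    with tempfile.TemporaryDirectory() as shared:
        to_node = Path(shared) / "to_node"  # hypothetical mailbox directory
        post_message(to_node, {"job": 1, "action": "start"})
        post_message(to_node, {"job": 2, "action": "start"})
        return collect_messages(to_node)
```

In this sketch the side with network access would call `post_message`, and the process on the isolated compute node would call `collect_messages` in a polling loop. File names are random, so message order is not preserved; a real relay would need sequencing, error handling, and cleanup of stale temporary files.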


Summary

Integration of HPC resources into CMS computing

The CMS experiment aims to increase its use of High Performance Computing (HPC) resources, both to help cover its growing computing demands and to gain access to the best available computing technologies, which are usually deployed at HPC sites. In the current international landscape of ever larger scientific projects, growing funds are being committed to HPC centers, and funding agencies are encouraging their national LHC communities to use such resources to satisfy, at least partially, their computing power demands. This is all the more relevant given the expected increase in those demands in the mid to long term (Run 3 and High-Luminosity LHC) [1] and the projected continuation of current funding levels. However, the network access restrictions often imposed by HPC centers are hard to overcome with the current CMS computing model and the technologies it employs [3].

Network requirements for HPC resource exploitation by CMS

The Barcelona Supercomputing Center

The HTCondor split-starter model

Integration with CMS workload management systems

BSC functional and scale integration tests

Handling of input and output data files

Access to other external services

Findings

Conclusion and next steps


Journal: EPJ Web of Conferences
Publication Date: Jan 1, 2021
Citations: 1
License type: CC BY 4.0


