Reaching new peaks for the future of the CMS HTCondor Global Pool

A Pérez-Calero Yzquierdo, K Hurtado Anampa, N Peregonov, M Mascheroni, J Dost, E Kizinevič, M Acosta Flechas, F A Khan, S Haleem, for the CMS Collaboration

doi:10.1051/epjconf/202125102055

Abstract

The CMS experiment at CERN employs a distributed computing infrastructure to satisfy its data processing and simulation needs. The CMS Submission Infrastructure team manages a dynamic HTCondor pool, aggregating mainly Grid clusters worldwide, but also HPC, Cloud and opportunistic resources. This CMS Global Pool, which currently involves over 70 computing sites worldwide and peaks at 350k CPU cores, is employed to successfully manage the simultaneous execution of up to 150k tasks. While the present infrastructure is sufficient to harness computing power at the current scale, the latest CMS estimates predict a noticeable expansion in the amount of CPU that will be required to cope with the massive data increase of the High-Luminosity LHC (HL-LHC) era, planned to start in 2027. This contribution presents the latest results of the CMS Submission Infrastructure team in exploring and expanding the scalability reach of our Global Pool, in order to detect and overcome in advance any barriers to the HL-LHC goals, while maintaining high efficiency in our workload scheduling and resource utilization.

Highlights

The Submission Infrastructure (SI) team runs the computing infrastructure in which processing, reconstruction, simulation, and analysis of the CMS experiment physics data take place
Opportunistic, High Performance Computing (HPC) and Cloud resources have been added to the Global Pool, which now routinely aggregates over 300,000 CPU cores, in an increasing proportion compared to the standard Grid site slots
Considering the growing scale of data to be collected by CMS in the High-Luminosity LHC (HL-LHC) phase, driven by increasing detector trigger rates and event complexity, CMS published its estimated future computational needs in 2020 [16]
To explore increasingly large scales, test pools can be simulated by running multiple multi-core startd daemons for each GlideinWMS pilot job running on the Grid
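
The last highlight implies a simple multiplication: each pilot job hosts several startd daemons, and each startd advertises several cores, so the simulated pool can be much larger than the number of pilots actually running. The sketch below works through that arithmetic in Python. It is only an illustration under stated assumptions: the class and function names are hypothetical, and the figures used (500,000 target cores, 4 startds per pilot, 8 cores per startd) are invented for the example and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestPoolPlan:
    """Hypothetical description of a simulated test pool."""

    pilots: int             # GlideinWMS pilot jobs running on the Grid
    startds_per_pilot: int  # startd daemons launched inside each pilot
    cores_per_startd: int   # cores advertised by each multi-core startd

    @property
    def simulated_startds(self) -> int:
        # Total number of startd daemons reporting to the test pool.
        return self.pilots * self.startds_per_pilot

    @property
    def simulated_cores(self) -> int:
        # Total number of cores the test pool appears to contain.
        return self.simulated_startds * self.cores_per_startd


def pilots_needed(target_cores: int, startds_per_pilot: int,
                  cores_per_startd: int) -> int:
    """Smallest number of pilots that reaches at least target_cores."""
    cores_per_pilot = startds_per_pilot * cores_per_startd
    # Ceiling division using integers only.
    return -(-target_cores // cores_per_pilot)


if __name__ == "__main__":
    # All three inputs are illustrative assumptions, not paper values.
    n = pilots_needed(500_000, startds_per_pilot=4, cores_per_startd=8)
    plan = TestPoolPlan(n, startds_per_pilot=4, cores_per_startd=8)
    print(plan.pilots, plan.simulated_startds, plan.simulated_cores)
```

Under these assumed inputs, the number of pilots required scales inversely with the number of cores simulated per pilot, which is why stacking several startds in each pilot makes larger test scales reachable with the same Grid allocation.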

Summary

The CMS Submission Infrastructure

The Submission Infrastructure (SI) team runs the computing infrastructure in which processing, reconstruction, simulation, and analysis of the CMS experiment physics data take place. A number of CMS sites have expanded their computing capacity by locally aggregating resources from High Performance Computing (HPC) facilities in a way that is transparent to CMS, as exemplified by the CNAF [9] and KIT [10] cases, where pilot jobs arriving at the sites’ compute elements are in turn rerouted to the HPC cluster batch system. This approach follows the CMS strategy to employ HPC resources whenever available [11]. Additional external pools can be federated into the SI by enabling flocking from the CMS schedds.

Scalability requirements for the Global Pool

Scalability tests

Testing setup

Full infrastructure test

Overall evaluation of the test results

Findings

Conclusions

Journal: EPJ Web of Conferences
Publication Date: Jan 1, 2021
Citations: 1
License type: CC BY 4.0
