Abstract
Predictions of the LHC computing requirements for Run 3 and Run 4 (HL-LHC) over the course of the next 10 years show a considerable gap between required and available resources, assuming budgets will globally remain flat at best. This will require some radical changes to the computing models for the data processing of the LHC experiments. Concentrating computational resources in fewer, larger and more efficient centres should increase the cost-efficiency of the operation and, thus, of the data processing. Large-scale general purpose HPC centres could play a crucial role in such a model. We report on the technical challenges and solutions adopted to enable the processing of ATLAS experiment data on the European flagship HPC system Piz Daint at CSCS, now acting as a pledged WLCG Tier-2 centre. As the transition of the Tier-2 from classic to HPC resources has been finalised, we also report on performance figures over two years of production running and on efforts towards a deeper integration of the HPC resource within the ATLAS computing framework at different tiers.
Highlights
In view of the challenges posed by the computing needs of the High-Luminosity LHC runs (2025-2034) [1], significant R&D efforts have been put in place as part of the upgrade programmes, in order to address the predicted shortfall of resources for reconstruction and offline computing resulting from higher trigger rates, larger event sizes and greater event complexity.
The integration of general purpose HPC machines with the LHC experiment frameworks has been a hot topic for several years as, in some circumstances, they hold the potential to offer a more cost-effective data processing infrastructure compared to dedicated resources in the form of commodity clusters.
The solution architected for CSCS is based on the Tier-2 architecture, with one crucial difference: the nodes used for the processing are no longer reserved and permanently customised for WLCG workloads.
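To illustrate what giving up permanent node customisation implies in practice, the sketch below shows how a WLCG payload could set up the experiment environment from CVMFS at job start on an otherwise generic HPC node, submitted through the shared batch system. It is a minimal, hypothetical Python example: the SLURM directives, the setup path under /cvmfs/atlas.cern.ch and the scratch location are illustrative assumptions, not the configuration actually deployed at CSCS.

# Minimal sketch (assumption): a pilot-side helper that submits a WLCG payload
# to the shared batch system of an HPC centre. Because the nodes are not
# permanently customised for WLCG work, the job itself sources the experiment
# environment from CVMFS at start-up. Paths and setup commands below are
# illustrative placeholders, not the CSCS production configuration.
import subprocess
import textwrap

def submit_payload(payload_cmd: str, scratch_dir: str = "/scratch/wlcg") -> str:
    """Write a batch script that sets up ATLAS software from CVMFS and submit it."""
    script = textwrap.dedent(f"""\
        #!/bin/bash
        #SBATCH --job-name=wlcg-payload
        #SBATCH --nodes=1
        #SBATCH --time=12:00:00
        # The node is generic: pull the experiment environment from CVMFS per job.
        export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
        source "$ATLAS_LOCAL_ROOT_BASE/user/atlasLocalSetup.sh"
        cd {scratch_dir}
        {payload_cmd}
        """)
    with open("payload.sbatch", "w") as f:
        f.write(script)
    # sbatch prints a confirmation such as "Submitted batch job 123456"
    out = subprocess.run(["sbatch", "payload.sbatch"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

if __name__ == "__main__":
    print(submit_payload("echo 'simulated ATLAS payload'"))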
Summary
This will require optimisation of several aspects of experiment software, resource provisioning and usage, system performance and efficiency, and data processing. In this context, the integration of general purpose HPC machines with the LHC experiment frameworks has been a hot topic for several years as, in some circumstances, they hold the potential to offer a more cost-effective data processing infrastructure compared to dedicated resources in the form of commodity clusters.
One of the main challenges is that WLCG workloads pose the additional requirement of interfacing services that are external to the Piz Daint integrated network: in particular, mount points are needed for CVMFS and for a dedicated GPFS scratch file system. The latter is not one of the two shared Lustre file systems integrated with the Cray, and is hardened and tuned for the I/O patterns of WLCG jobs. For a full Tier-2 integration, the classic grid services are also needed: ARC compute elements, a dCache storage element, squids for Frontier and CVMFS, the information system and VO boxes.
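As a concrete illustration of the external dependencies listed above, the following sketch checks that the CVMFS repositories and the dedicated scratch file system a WLCG job relies on are actually visible on a compute node before a payload is started. It is a hedged example only: the repository names and the GPFS scratch path are assumptions for illustration, not the exact CSCS mount points.

# Minimal sketch (assumption): verify that the externally provided mount points
# required by WLCG jobs are present on a compute node. Repository names and the
# GPFS scratch path below are illustrative, not the actual CSCS layout.
import os
import sys

CVMFS_REPOS = [
    "/cvmfs/atlas.cern.ch",        # experiment software
    "/cvmfs/atlas-condb.cern.ch",  # conditions data
]
GPFS_SCRATCH = "/scratch/wlcg"      # dedicated scratch, tuned for WLCG I/O patterns

def node_is_ready() -> bool:
    """Return True if all required mounts are visible and the scratch is writable."""
    ok = True
    for repo in CVMFS_REPOS:
        if not os.path.isdir(repo):
            print(f"missing CVMFS mount: {repo}")
            ok = False
    if not (os.path.isdir(GPFS_SCRATCH) and os.access(GPFS_SCRATCH, os.W_OK)):
        print(f"scratch not writable: {GPFS_SCRATCH}")
        ok = False
    return ok

if __name__ == "__main__":
    sys.exit(0 if node_is_ready() else 1)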