Abstract

The challenges posed by the HL-LHC era are not limited to the sheer amount of data to be processed: optimizing the analysts' experience will also bring important benefits to the LHC communities, in terms of total resource needs, user satisfaction and reduced time to publication. At the Italian National Institute for Nuclear Physics (INFN) a portable software stack for analysis has been proposed, based on cloud-native tools and capable of providing users with a fully integrated analysis environment for the CMS experiment. The main characterizing traits of the solution are its user-driven design and its portability to any cloud resource provider. All this is made possible by an evolution towards a “python-based” framework, which enables the usage of a set of open-source technologies widely adopted in both cloud-native and data-science environments. In addition, a “single sign-on”-like experience is available thanks to the standards-based integration of INDIGO-IAM with all the tools. The integration of compute resources is done through the customization of a JupyterHub solution, able to spawn identity-aware user instances ready to access data with no further setup actions. The integration with GPU resources is also available, designed to sustain increasingly widespread ML-based workflows. Seamless connections between the user UI and batch/big-data processing frameworks (Spark, HTCondor) are possible. Finally, the experiment data access latency is reduced thanks to the integrated deployment of a scalable set of caches, as developed in the context of the ESCAPE project, and as such compatible with future scenarios where a data lake will be available for the research community. The outcome of the evaluation of such a solution in action is presented, showing how a real CMS analysis workflow can make use of the infrastructure to achieve its results.
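
As an illustration of the kind of standards-based integration mentioned above, the sketch below shows how an OpenID Connect login against an INDIGO-IAM instance could be wired into JupyterHub using the generic OAuthenticator. The URLs, client credentials and endpoint paths are illustrative placeholders, not the configuration actually used in the deployment described here.

    # jupyterhub_config.py (illustrative sketch; `c` is the configuration object
    # provided by JupyterHub when this file is loaded)
    from oauthenticator.generic import GenericOAuthenticator

    c.JupyterHub.authenticator_class = GenericOAuthenticator

    # Hypothetical INDIGO-IAM instance; the endpoint paths follow the usual
    # OpenID Connect layout, but in a real setup they should be taken from the
    # issuer's .well-known/openid-configuration document.
    iam_issuer = "https://iam.example.org"
    c.GenericOAuthenticator.login_service = "INDIGO-IAM"
    c.GenericOAuthenticator.client_id = "jupyterhub-client"        # placeholder
    c.GenericOAuthenticator.client_secret = "change-me"            # placeholder
    c.GenericOAuthenticator.oauth_callback_url = "https://hub.example.org/hub/oauth_callback"
    c.GenericOAuthenticator.authorize_url = iam_issuer + "/authorize"
    c.GenericOAuthenticator.token_url = iam_issuer + "/token"
    c.GenericOAuthenticator.userdata_url = iam_issuer + "/userinfo"
    c.GenericOAuthenticator.scope = ["openid", "profile", "email"]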

Highlights

  • As the technologies and the challenges evolve, a new approach is needed to provide HL-LHC communities with all the tools they need to get their analysis work done

  • The end-user data analysis workflow is evolving in many respects, with a new event data format called NanoAOD [1] designed by the CMS Collaboration [2] to satisfy the needs of a large fraction of physics analyses, with a per-event size of the order of 1 kB while still containing all the top-level information typically used in the last steps of an analysis (a minimal NanoAOD read-out sketch follows this list)

  • Several initiatives are arising in this context, such as those at CERN [4] and in the US [5]; at the Italian National Institute for Nuclear Physics (INFN) an effort is under way to leverage modern cloud-native paradigms as building blocks for the analysis infrastructure, with the main objective of deploying a platform that can be challenged and optimized in preparation for the HL-LHC era, fully compatible with the resource-provisioning model and service-portfolio composition strategy of INFN-Cloud
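
To make the NanoAOD bullet above more concrete, the following sketch reads a few top-level branches from a NanoAOD file with uproot and awkward-array. The file name and the specific selection are hypothetical, chosen only to show how the flat, column-oriented layout is consumed from Python.

    # Minimal NanoAOD read-out sketch (hypothetical file name and selection).
    import uproot
    import awkward as ak

    with uproot.open("nano_sample.root") as f:      # placeholder file
        events = f["Events"]                        # NanoAOD stores a single flat "Events" tree
        # Read only the columns needed, keeping the memory footprint small.
        muons = events.arrays(["Muon_pt", "Muon_eta", "Muon_phi"], library="ak")
        # Example selection: keep events with at least one muon above 25 GeV.
        selected = muons[ak.any(muons.Muon_pt > 25, axis=1)]
        print(len(selected), "events pass the selection")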

Summary

Introduction

As the technologies and the challenges evolve, a new approach is needed to provide HL-LHC communities with all the tools they need to get their analysis work done. In terms of computing infrastructure, this evolution brings the opportunity for R&D around new solutions that, on the one hand, offer the possibility to exploit models based on, e.g., Python-based WebUIs and, on the other hand, allow the throughput to be optimized, a key aspect for analysis at CMS. This translates into a model which foresees the usage of a well-equipped node with specialized hardware, such as NVMe storage and many CPU and GPU cores, enabling analysis activities at the MHz event-processing level and beyond. The primary motivations of the proposed architecture are to satisfy the shift toward interactivity, as opposed to the GRID batch approach, and to maximize the throughput while analysing the experiment data. While this can be done using a single node with specialized hardware, it is important to exploit scale-out capabilities as well and, possibly, to embed everything in the very same deployment. This approach does not preclude the use of Singularity at the application level.
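
The following sketch, which is not taken from the project itself, illustrates the single-node versus scale-out idea under stated assumptions: the same Dask-based processing function can use all the cores of one well-equipped node or be pointed at a remote scheduler to fan out, with the per-partition work standing in for the actual event selection and histogramming.

    # Illustrative sketch only: local interactive execution vs. scale-out with Dask.
    import dask.bag as db
    from dask.distributed import Client

    def process_chunk(chunk):
        # Placeholder for the real per-chunk work (event selection, histogramming, ...).
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        # Local mode: a cluster spanning the cores of the interactive node.
        # To scale out, point the client at a remote scheduler instead, e.g.
        # Client("tcp://scheduler.example.org:8786") -- hypothetical address.
        client = Client()
        data = db.from_sequence(range(1_000_000), npartitions=64)
        total = data.map_partitions(lambda part: [process_chunk(part)]).sum().compute()
        print("result:", total)
        client.close()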

Architecture overview
Identity and access management
Caching
On-demand computation and scale out
Autoscaling on custom metrics
Portability of the system and deployment strategy
First user experiences
Current experiences and lessons learnt
Conclusion and plans