Abstract

The High Luminosity LHC project at CERN, which is expected to deliver a ten-fold increase in the luminosity of proton-proton collisions over the LHC, will start operation towards the end of this decade and will deliver an unprecedented scientific data volume of multi-exabyte scale. This vast amount of data has to be processed and analysed, and the corresponding computing facilities must ensure fast and reliable data processing for physics analyses by scientific groups distributed all over the world. The present LHC computing model will not be able to provide the required infrastructure growth, even taking into account the expected evolution in hardware technology. To address this challenge, several novel methods of conducting end-user analysis are under evaluation by the ATLAS Collaboration. State-of-the-art workflow management technologies and tools to handle these methods within the existing distributed computing system are now being evaluated and developed. In addition, the evolution of computing facilities and its impact on ATLAS analysis workflows are being closely followed.

Highlights

  • The experiments at the Large Hadron Collider [1] use a worldwide complex and distributed computing infrastructure with almost 1 million computing cores and an exabyte of storage, interconnected through high-speed networks

  • The extreme computing needs of the experiments running from 2027 in the High Luminosity LHC (HL-LHC) era, primarily for data processing and analysis that are crucial for physics results, will not be satisfied by the current infrastructure, even allowing for the expected decrease in hardware costs

  • In Run-2 (2015-2018) the Analysis Object Data (AOD) datasets were processed in the ATLAS derivation framework [5], producing about 80 different Derived AOD (DAOD) formats that contain a subset of events and reduced reconstruction information tailored for specific physics analysis and performance groups


Summary

Introduction

The experiments at the Large Hadron Collider [1] use a complex, worldwide distributed computing infrastructure with almost 1 million computing cores and an exabyte of storage, interconnected through high-speed networks. Technologies that address the HL-LHC computing challenges may also be applicable to other scientific communities in high-energy physics, astronomy and beyond that need to analyse large-scale data volumes. In recent years there has been an explosion of ideas and technologies from the wider data science community, some of which can be and have been applied to analyses of ATLAS data. These include Machine Learning and Deep Learning techniques, use of alternative hardware such as GPUs and FPGAs, and a Python-based ecosystem of numerical libraries for vectorised array computation. To address the HL-LHC distributed data handling challenge, ATLAS has launched several R&D projects to study the feasibility of setting up dedicated computing facilities for end-user analysis, to evaluate new analysis workflows (many using Machine Learning and Artificial Intelligence), and to identify new tools to be developed to describe more complex analysis workflows. This paper describes the generation of analysis tools in ATLAS, ideas on the roles of analysis facilities and the changes required in distributed computing software.

Physics analysis in ATLAS
The current computing infrastructure
Potential implementations of analysis facilities
Further considerations
New technologies and tools
Containers
New services and tools development
Authentication and authorisation implementations
Accelerators and distributed computing
Evolution of ATLAS distributed computing services
Conclusions
