Abstract
The High Luminosity LHC project at CERN, which is expected to deliver a ten-fold increase in the luminosity of proton-proton collisions over the LHC, will start operation towards the end of this decade and will deliver an unprecedented scientific data volume of multi-exabyte scale. This vast amount of data has to be processed and analysed, and the corresponding computing facilities must ensure fast and reliable data processing for physics analyses by scientific groups distributed all over the world. The present LHC computing model will not be able to provide the required infrastructure growth, even taking into account the expected evolution in hardware technology. To address this challenge, several novel approaches to how end-user analysis will be conducted are under evaluation by the ATLAS Collaboration. State-of-the-art workflow management technologies and tools to handle these approaches within the existing distributed computing system are now being evaluated and developed. In addition, the evolution of computing facilities and its impact on ATLAS analysis workflows are being closely followed.
Highlights
The experiments at the Large Hadron Collider [1] use a complex, worldwide distributed computing infrastructure with almost 1 million computing cores and an exabyte of storage, interconnected through high-speed networks.
The extreme computing needs of the experiments running from 2027 in the High Luminosity LHC (HL-LHC) era, primarily for data processing and analysis that are crucial for physics results, will not be satisfied by the current infrastructure, even allowing for the expected decrease in hardware costs.
In Run-2 (2015-2018) the Analysis Object Data (AOD) datasets were processed in the ATLAS derivation framework [5], producing about 80 different Derived AOD (DAOD) formats that contain a subset of events and reduced reconstruction information tailored for specific physics analysis and performance groups.
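The derivation step described above combines event selection ("skimming") with the removal of unneeded per-event information ("slimming"). A minimal toy sketch of that idea is shown below; the event fields, branch names, and cut values are hypothetical illustrations, not the actual ATLAS derivation framework or DAOD definitions.

```python
# Toy sketch of a derivation-style skim/slim step.
# Field names ("n_leptons", "met", "jet_pts") and cuts are hypothetical.
events = [
    {"n_leptons": 2, "met": 45.0, "jet_pts": [120.0, 80.0, 30.0]},
    {"n_leptons": 0, "met": 12.0, "jet_pts": [25.0]},
    {"n_leptons": 1, "met": 90.0, "jet_pts": [200.0, 55.0]},
]

KEEP_BRANCHES = {"met", "jet_pts"}  # "slimming": keep only needed branches

def passes_skim(event):
    # "skimming": keep only events relevant to a hypothetical analysis
    return event["n_leptons"] >= 1 and event["met"] > 30.0

daod = [
    {key: val for key, val in ev.items() if key in KEEP_BRANCHES}
    for ev in events
    if passes_skim(ev)
]
print(len(daod))  # 2 events survive the skim
```

In a real derivation framework each DAOD format would encode a selection and branch list like this one, tuned by the corresponding physics or performance group.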
Summary
The experiments at the Large Hadron Collider [1] use a complex, worldwide distributed computing infrastructure with almost 1 million computing cores and an exabyte of storage, interconnected through high-speed networks. Technologies that will address the HL-LHC computing challenges may be applicable to other scientific communities in high-energy physics, astronomy and beyond to analyse large-scale data volumes. In recent years there has been an explosion of ideas and technologies from the wider data science community, some of which can be and have been applied to analyses of ATLAS data. These include Machine Learning and Deep Learning techniques, the use of alternative hardware such as GPUs and FPGAs, and a Python-based ecosystem of numerical libraries for vectorised array computation. To address the HL-LHC distributed data handling challenge, ATLAS has launched several R&D projects to study the feasibility of setting up dedicated computing facilities for end-user analysis, to evaluate new analysis workflows (many using Machine Learning and Artificial Intelligence), and to identify new tools to be developed to describe more complex analysis workflows. This paper describes the current generation of analysis tools in ATLAS, ideas for the roles of analysis facilities, and the changes required in distributed computing software.
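The vectorised, array-oriented style mentioned above replaces per-event Python loops with whole-array operations. A minimal sketch using NumPy is given below; the physics quantities and cut values are hypothetical, chosen only to illustrate the programming model.

```python
# Illustrative sketch of vectorised array computation with NumPy.
# The kinematic quantities and selection cuts are hypothetical.
import numpy as np

pt = np.array([12.5, 48.0, 96.3, 7.1, 150.2])  # transverse momenta (GeV)
eta = np.array([0.4, -1.9, 2.6, 0.1, -0.7])    # pseudorapidities

# One boolean mask expresses the whole selection; no explicit event loop.
mask = (pt > 25.0) & (np.abs(eta) < 2.5)
selected = pt[mask]
print(selected)  # [ 48.  150.2]
```

The same idiom scales from this toy example to large columnar datasets, which is one reason the Python numerical ecosystem is attractive for end-user analysis.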