Abstract
In recent years the use of machine learning techniques within data-intensive sciences in general, and high-energy physics (HEP) in particular, has rapidly increased. This is due in part to the availability of large datasets on which such algorithms can be trained, and in part to suitable hardware, such as graphics or tensor processing units, which greatly accelerates the training and execution of such algorithms. Within the HEP domain, the development of these techniques has so far relied on resources external to the primary computing infrastructure of the WLCG (Worldwide LHC Computing Grid). In this paper we present an integration of hardware-accelerated workloads into the Grid through the declaration of dedicated queues with access to hardware accelerators and the use of Linux container images holding a modern data science software stack. A frequent use case in the development of machine learning algorithms is the optimization of neural networks through the tuning of their hyper-parameters (HP). This often requires training and comparing a large range of network variations, which for some optimization schemes can be done in parallel, a workload well suited for Grid computing. An example of such a hyper-parameter scan on Grid resources for the case of flavor tagging within ATLAS is presented.
Highlights
The increase in dataset size and computing resource requirements for the HL-LHC is pushing WLCG experiments to look at Machine Learning (ML) techniques to improve the efficiency of data analysis and processing
Enabling GPUs on the WLCG Grid: the WLCG distributed resources have been built around the HTC (High Throughput Computing) paradigm, which focuses on the efficient execution of a large number of loosely coupled tasks
The use of GPUs in ATLAS, and more generally in WLCG, may increase due to the introduction of ML and to resources coming online at sites and at HPC centres
Summary
The increase in dataset size and computing resource requirements for the HL-LHC is pushing WLCG experiments to look at Machine Learning (ML) techniques to improve the efficiency of data analysis and processing. In this paper we present an integration of hardware-accelerated workloads into the Grid through the declaration of dedicated queues with access to hardware accelerators and the use of Linux container images holding a modern data science software stack. An example of a hyper-parameter scan on Grid resources for the case of flavor tagging within ATLAS is presented.
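To illustrate why a hyper-parameter scan maps well onto independent Grid jobs, the following is a minimal sketch of a grid scan in which every point is evaluated independently of the others. It is not the workflow used in the paper: the search space, the `toy_loss` function (a stand-in for training one network and returning its validation loss), and the use of a local thread pool in place of Grid queues are all assumptions made for illustration.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space; names and values are illustrative only.
SEARCH_SPACE = {
    "learning_rate": [1e-3, 1e-2, 1e-1],
    "hidden_units": [32, 64, 128],
}


def toy_loss(params):
    """Stand-in for training one network and returning its validation loss."""
    lr = params["learning_rate"]
    units = params["hidden_units"]
    return (lr - 1e-2) ** 2 + ((units - 64) / 64) ** 2


def grid_points(space):
    """Yield every combination of hyper-parameter values as a dict."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def run_scan(space, max_workers=4):
    """Evaluate all grid points independently; return (best_params, best_loss)."""
    points = list(grid_points(space))
    # Each point needs no information from any other point, so the
    # evaluations can run concurrently (here: local threads as a stand-in
    # for separate jobs).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        losses = list(pool.map(toy_loss, points))
    best_index = min(range(len(points)), key=losses.__getitem__)
    return points[best_index], losses[best_index]


best_params, best_loss = run_scan(SEARCH_SPACE)
print(best_params, best_loss)
```

Because no grid point depends on the result of another, the same structure carries over when each evaluation is submitted as a separate job. Optimization schemes that choose the next point based on earlier results would parallelize less directly.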