Abstract

The present work aims to optimize the use of the computing resources available at the Italian Tier-2 grid sites of the ALICE experiment at the CERN LHC by making them accessible to interactive distributed analysis through modern cloud computing solutions. The scalability and elasticity of computing resources obtained via dynamic (“on-demand”) provisioning are essentially limited by the size of the computing site, and reach the theoretical optimum only in the asymptotic case of infinite resources. The main challenge of the project is to overcome this limitation by federating different sites through a distributed cloud facility. The storage capacities of the participating sites are seen as a single federated storage area, which removes the need to mirror data across them: high data access efficiency is ensured by location-aware analysis software and storage interfaces, in a way that is transparent to the end user. Moreover, interactive analysis on the federated cloud reduces execution time with respect to grid batch jobs. Tests of the investigated solutions for both cloud computing and distributed storage over a wide area network are presented.

Highlights

  • The computing models of the LHC experiments reflect an organization originally proposed by the MONARC model, which foresees a hierarchically organized set of computing centers, all connected in a computing Grid and each of a size compatible with deployment at a single institution

  • In order to provide our communities with resources for interactive parallel analysis, the ALICE collaboration has defined a standard for the deployment of Analysis Facilities based on PROOF (Parallel ROOT Facility), an extension of the ROOT framework

  • The physicists of the ALICE collaboration have benefited beyond their expectations from analyzing data on the Analysis Facilities: the first published works were based entirely on analyses run on the Analysis Facility at CERN, which quickly became inadequate for the needs of the experiment

Summary

Introduction

The computing models of the LHC experiments reflect an organization originally proposed by the MONARC model, which foresees a hierarchically organized set of computing centers, all connected in a computing Grid and each of a size compatible with deployment at a single institution. The project aims to federate the Italian ALICE analysis centres through a distributed cloud, taking full advantage of the good network connectivity provided by GARR-X, in order to run cloud-based applications over data residing at any of the federation members. Such a federated cloud extends the data available for interactive analysis to the whole set managed by the Italian ALICE Tier-2 sites, reducing execution time with respect to grid batch jobs and reducing the need for data duplication with respect to separate “unfederated” analysis facilities.

To take full advantage of wide area networks, the protocols should respond well to the increased latency between the two connection endpoints. They should tolerate errors and failures, facilitate the copying of files from one site to another, and allow well-defined file chunks to be read without transferring unnecessary data. Protocols that allow access to data over the WAN, such as HTTP and XROOTD, are already used by the scientific community and in particular by the LHC experiments.

doi:10.1088/1742-6596/664/2/022033
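The requirement of reading well-defined file chunks without transferring unnecessary data can be illustrated, for the HTTP case, with a byte-range request. The following is only a minimal sketch using the Python standard library, not the access layer used by the project: the function name `build_range_request` and the example URL are hypothetical, and the request is constructed but never sent.

```python
from urllib.request import Request


def build_range_request(url, offset, length):
    """Build an HTTP GET request for `length` bytes starting at `offset`.

    Hypothetical helper for illustration only; it prepares the request
    object and does not contact any server.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # The HTTP Range header is inclusive on both ends,
    # so the last requested byte is offset + length - 1.
    last_byte = offset + length - 1
    return Request(url, headers={"Range": f"bytes={offset}-{last_byte}"})


# Example: ask for 512 bytes starting at byte 1024 of a (made-up) remote file.
request = build_range_request("http://storage.example.org/data/file.root", 1024, 512)
print(request.get_header("Range"))
```

A server that supports range requests answers such a request with status 206 (Partial Content) and only the requested bytes, so a client reading a small part of a large file avoids transferring the rest over the wide area network.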

Italian infrastructure for the Virtual Analysis Facility
Production activities on the Torino site
Benchmarking
Monitoring
Data access and storage federation
Conclusions