Abstract

The main goal of this work, in the context of WLCG, is to test a storage setup in which the storage areas are geographically distributed and some pools behave as data caches. Users can then find the data needed for their analysis in a local cache and process them locally. We first demonstrate that the distributed setup of a DPM storage system is almost transparent to users in terms of performance and functionality. We then implement a mechanism that fills the storage cache with data registered in the Rucio data management system, and we test it by running a physics analysis that reads its input data from the cache. In this way we demonstrate that such a system can be useful for diskless sites that have only a local cache, making it possible to optimize the distribution and analysis of experimental data.

Highlights

  • Experience gained over several years of distributed computing for large scientific communities has shown that the distributed storage infrastructure of the Worldwide LHC Computing Grid (WLCG) [1] performs very well for the needs of the LHC experiments

  • The main goal of this work, in the context of WLCG, is to test a storage setup where the storage areas are geographically distributed and the system provides some pools behaving as data caches

  • User analysis is often based on clusters hosted at small sites such as Tier-3s: it is important to provide dynamic and efficient data access at such sites as well


Summary

Introduction

Experience gained over several years of distributed computing for large scientific communities has shown that the distributed storage infrastructure of the Worldwide LHC Computing Grid (WLCG) [1] performs very well for the needs of the LHC experiments. The LHC experiments, WLCG and the funding agencies have started a process of optimizing the storage resources and the human resources needed for storage operations. Some keywords for this ongoing process are: distributed storage, common namespaces, data caching, redundancy, and different Quality of Service. In the setup tested here, some disk pools, configured as volatile at different sites, work as local data caches with zoning access mechanisms. The proposed storage configuration was tested in the context of the ATLAS experiment, including with a real user analysis job that reads its input data from the cache.
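The idea of a volatile pool acting as a local file cache can be illustrated with a toy model. The sketch below is not the DPM implementation: the class `VolatilePoolCache`, the `fetch` callback standing in for a read from remote storage, and the least-recently-used eviction policy are all assumptions made for illustration, since the summary does not state how files are selected for removal.

```python
from collections import OrderedDict
from typing import Callable


class VolatilePoolCache:
    """Toy model of a volatile pool: fixed capacity, fetch on miss.

    Eviction is least-recently-used, which is an assumption of this sketch.
    """

    def __init__(self, capacity_bytes: int, fetch: Callable[[str], bytes]) -> None:
        self.capacity_bytes = capacity_bytes
        self.fetch = fetch  # hypothetical callback reading from remote storage
        self.files: "OrderedDict[str, bytes]" = OrderedDict()
        self.used_bytes = 0
        self.hits = 0
        self.misses = 0

    def read(self, name: str) -> bytes:
        """Return the file content, filling the cache on a miss."""
        if name in self.files:
            self.hits += 1
            self.files.move_to_end(name)  # mark as most recently used
            return self.files[name]
        self.misses += 1
        data = self.fetch(name)
        self._store(name, data)
        return data

    def _store(self, name: str, data: bytes) -> None:
        """Insert a file, evicting least recently used files if needed."""
        if len(data) > self.capacity_bytes:
            return  # larger than the whole pool: serve it without caching
        while self.used_bytes + len(data) > self.capacity_bytes:
            _, evicted = self.files.popitem(last=False)  # oldest entry first
            self.used_bytes -= len(evicted)
        self.files[name] = data
        self.used_bytes += len(data)
```

In this model the first read of a file pays the cost of the remote fetch, while repeated reads by local analysis jobs are served from the pool, which is the behaviour a diskless site with only a local cache would rely on.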

DPM - Disk Pool Manager
The distributed storage configuration
The volatile pools as file caches
System Tests and results
Testing the Distributed setup
Testing the Cache
File caching in a real use case
Findings
Conclusions
