Abstract

The ATLAS Distributed Data Management system stores more than 220 PB of physics data across more than 130 sites globally. Rucio, the next-generation data management system of the ATLAS collaboration, has now been operated successfully for two years. However, with increasing workload and utilization, more automated and advanced methods of managing the data are needed. In this article we present an extension to the data management system that detects and predicts when storage elements reach or exceed their capacity limits. The system automatically and dynamically rebalances the data to other storage elements, while respecting data distribution policies and ensuring the availability of the data. This concept lowers the operational burden, as these cumbersome procedures previously had to be done manually. It also enables the system to use its distributed resources more efficiently, which benefits not only the data management system itself but, as a consequence, also the workload management and production systems. This contribution describes the concept and architecture behind those components and shows the benefits provided by the system.

Highlights

  • The ATLAS distributed data management system Rucio organizes over 220 petabytes of physics data across more than 130 sites worldwide

  • Destination selection: the key requirement is that the original distribution policy is preserved

  • The algorithm looks up the replication policy that governed the original data placement and selects a storage element from the set that policy allows, using a selection weighted by available space
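
The destination selection described in the highlights can be illustrated with a minimal sketch. It is not Rucio's or BB8's actual code: the `StorageElement` class, the `select_destination` function, and the use of a weighted random draw are assumptions made for illustration, and the candidate list stands in for the set of storage elements that the original replication policy allows.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StorageElement:
    """Hypothetical record for one storage element (not a Rucio type)."""
    name: str
    free_bytes: int


def select_destination(
    candidates: Sequence[StorageElement],
    source_name: str,
    rng: Optional[random.Random] = None,
) -> Optional[StorageElement]:
    """Pick a destination from the policy-allowed candidates.

    Assumption: the selection is a random draw weighted by free space,
    so emptier storage elements are chosen proportionally more often.
    """
    rng = rng or random.Random()
    # Never move data back onto the storage element being drained,
    # and skip elements that have no free space at all.
    eligible = [
        se for se in candidates
        if se.name != source_name and se.free_bytes > 0
    ]
    if not eligible:
        return None
    weights = [se.free_bytes for se in eligible]
    return rng.choices(eligible, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical storage element names and free-space figures.
    allowed_by_policy = [
        StorageElement("SITE_A_DATADISK", 10 * 10**12),
        StorageElement("SITE_B_DATADISK", 90 * 10**12),
        StorageElement("SITE_C_DATADISK", 0),
    ]
    choice = select_destination(allowed_by_policy, source_name="SITE_A_DATADISK")
    print(choice.name if choice else "no eligible destination")
```

Because the candidates are restricted to the set the original policy allows before any weighting is applied, the draw cannot place data outside that policy; the weighting only decides which of the permitted storage elements is preferred.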


Summary

Detecting imbalances

The ATLAS distributed data management system Rucio organizes over 220 petabytes of physics data across more than 130 sites worldwide. The aggregated transfer volume ranges between 1 and 2 petabytes every day. With the increased workload of Run 2 and the heavy utilization of data transfers, imbalances between computing slots and available capacity arise regularly. To compensate for this, the Balancer of Bits (BB8) daemon was developed.

Destination selection
Conclusion & Future work