Abstract

This contribution details the deployment of Rucio, the ATLAS Distributed Data Management system. The main complication is that Rucio interacts with a wide variety of external services, and connects globally distributed data centres under different technological and administrative control, at an unprecedented data volume. It is therefore not possible to create a duplicate instance of Rucio for testing or integration. Every software upgrade or configuration change is thus potentially disruptive and requires fail-safe software and automatic error recovery. Rucio uses a three-layer scaling and mitigation strategy based on quasi-realtime monitoring. This strategy mainly employs independent stateless services, automatic failover, and service migration. The technologies used for deployment and mitigation include OpenStack, Puppet, Graphite, HAProxy and Apache. In this contribution, the interplay between these components, their deployment, software mitigation, and the monitoring strategy are discussed.

Highlights

  • The high-energy physics experiment ATLAS creates non-trivial amounts of data [1]

  • Many components interact with Rucio, most importantly the workload management system PanDA [5], so an uninterrupted service is required

  • Rucio leverages all features for load-balancing and fault-tolerance provided by OpenStack and Puppet, and implements custom handling on the application side as well
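
The application-side handling mentioned in the last highlight can be illustrated with a minimal sketch of failover across stateless service replicas. This is not Rucio's code: the names `call_with_failover`, `AllBackendsFailed`, `unreachable_server` and `healthy_server` are hypothetical, and the backends are plain Python callables standing in for real servers. The idea it shows is only that, when services keep no state, a failed request can be retried on any other replica.

```python
import time


class AllBackendsFailed(Exception):
    """Raised when every backend has been tried without success."""


def call_with_failover(backends, request, retries_per_backend=2, backoff_seconds=0.0):
    """Try each backend in order; move to the next one after repeated failures.

    `backends` is a list of callables taking the request and returning a
    response. They are assumed stateless, so any of them can serve it.
    """
    errors = []
    for backend in backends:
        for attempt in range(retries_per_backend):
            try:
                return backend(request)
            except ConnectionError as exc:
                # Record the failure, wait a little longer each time, then retry.
                errors.append((backend.__name__, attempt, str(exc)))
                time.sleep(backoff_seconds * (attempt + 1))
    raise AllBackendsFailed(errors)


# Hypothetical stand-ins for two replicas of the same stateless service.
def unreachable_server(request):
    raise ConnectionError("connection refused")


def healthy_server(request):
    return f"healthy_server handled {request}"


if __name__ == "__main__":
    print(call_with_failover([unreachable_server, healthy_server], "ping"))
```

In the deployment described in the abstract, a load balancer such as HAProxy would typically take this role in front of the servers; the sketch only mirrors the same logic on the client side.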

Summary

Scalable and fail-safe deployment of the ATLAS Distributed Data Management system Rucio. 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015). IOP Publishing. Journal of Physics: Conference Series 664 (2015) 062027, doi:10.1088/1742-6596/664/6/062027 (http://iopscience.iop.org/1742-6596/664/6/062027)

Published under licence by IOP Publishing Ltd