Abstract

Input data for applications that run in cloud computing centres can be stored at remote repositories, typically with multiple copies of the most popular data stored at many sites. Locating and retrieving the remote data can be challenging, and we believe that federating the storage can address this problem. In this approach, the closest copy of the data is used, based on geographical or other information. Currently, we are using the dynamic data federation Dynafed, a software solution developed by CERN IT. Dynafed supports several industry-standard interfaces, such as Amazon S3, Microsoft Azure and HTTP with WebDAV extensions. Dynafed functions as an abstraction layer that hides protocol-dependent authentication details from the user, so that the user only needs to provide an X.509 certificate. We have set up an instance of Dynafed and integrated it into the ATLAS distributed data management system, Rucio. We report on the challenges faced during the installation and integration.




Introduction

We would like to run data-intensive applications on globally distributed opportunistic resources that have no local storage. The ATLAS [1] experiment leverages a globally distributed system of infrastructure-as-a-service (IaaS) clouds as part of its distributed computing system. These resources are integrated into the ATLAS distributed computing system using the Cloudscheduler [2] technology developed at the University of Victoria. These IaaS resources do not support any local grid infrastructure. In this paper we describe a system leveraging Cloudscheduler and Dynafed, which successfully executed functional test jobs and transfers as part of the ATLAS distributed computing and data management systems on the CERN OpenStack [7] cloud resource. Data were read from and written to an object store implemented using Ceph [8] and exposing an S3-compatible gateway.
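The federation idea stated in the abstract, using the closest copy of the data based on geographical information, can be illustrated with a minimal sketch. This is not Dynafed's actual selection logic: the `Replica` type, the `closest_replica` function, the endpoint URLs and the coordinates are all hypothetical, and great-circle distance is assumed as the only ranking criterion.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of a file at a storage endpoint (hypothetical type)."""
    url: str    # illustrative endpoint URL, not a real service
    lat: float  # latitude of the hosting site in degrees
    lon: float  # longitude of the hosting site in degrees


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points on Earth."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def closest_replica(replicas, client_lat, client_lon):
    """Return the replica geographically nearest to the client.

    Assumption: distance is the only criterion. A real federator may also
    weigh endpoint availability or other information.
    """
    if not replicas:
        raise ValueError("no replicas available")
    return min(
        replicas,
        key=lambda rep: haversine_km(client_lat, client_lon, rep.lat, rep.lon),
    )


# Hypothetical replicas near Geneva and near Victoria, BC.
replicas = [
    Replica("https://storage-a.example.org/data/file1", 46.2, 6.1),
    Replica("https://storage-b.example.org/data/file1", 48.4, -123.4),
]

# A client near Vancouver is directed to the replica near Victoria.
chosen = closest_replica(replicas, 49.3, -123.1)
print(chosen.url)
```

In a federation, the client would then be pointed at the chosen URL rather than having to locate the copy itself, which is the property that makes storage-less cloud resources usable for data-intensive jobs.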

Data access
Authentication and authorization
Checksums
Results
Conclusions
