Abstract
Input data for applications that run in cloud computing centres can be stored at remote repositories, typically with multiple copies of the most popular data stored at many sites. Locating and retrieving the remote data can be challenging, and we believe that federating the storage can address this problem. In this approach, a client is directed to the closest copy of the data, chosen using geographical or other information. We currently use Dynafed, a dynamic data federation software solution developed by CERN IT. Dynafed supports several industry-standard interfaces, such as Amazon S3, Microsoft Azure and HTTP with WebDAV extensions. Dynafed functions as an abstraction layer that hides protocol-dependent authentication details from the user, so the user only needs to provide an X.509 certificate. We have set up an instance of Dynafed and integrated it into the ATLAS distributed data management system, Rucio. We report on the challenges faced during the installation and integration.
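The idea of using the closest copy of the data can be illustrated with a minimal sketch. This is not Dynafed's implementation; it only assumes that each replica has a known latitude and longitude and ranks replicas by great-circle distance to the client. The `Replica` class, the `closest_replica` function and the example.org URLs are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """A copy of a file at some storage endpoint (hypothetical structure)."""
    url: str
    lat: float  # latitude of the hosting site, in degrees
    lon: float  # longitude of the hosting site, in degrees


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def closest_replica(client_lat, client_lon, replicas):
    """Return the replica whose site is geographically nearest to the client."""
    if not replicas:
        raise ValueError("no replicas available")
    return min(
        replicas,
        key=lambda rep: haversine_km(client_lat, client_lon, rep.lat, rep.lon),
    )


# Hypothetical replicas of the same file at two sites.
replicas = [
    Replica("https://storage-geneva.example.org/data/file.root", 46.2, 6.1),
    Replica("https://storage-victoria.example.org/data/file.root", 48.4, -123.4),
]

# A client near Vancouver should be directed to the Victoria copy.
best = closest_replica(49.3, -123.1, replicas)
print(best.url)
```

A real federation may weigh other information as well, such as endpoint availability, which this sketch leaves out.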
Highlights
We would like to run data-intensive applications on globally distributed opportunistic resources that have no local storage.
The ATLAS [1] experiment leverages a globally distributed system of infrastructure as a service (IaaS) clouds as part of its distributed computing system. These resources are integrated into the ATLAS distributed computing system using the Cloudscheduler [2] technology developed at the University of Victoria.
In this paper we describe a system leveraging Cloudscheduler and Dynafed, which successfully executed functional test jobs and transfers as part of the ATLAS distributed computing and data management systems on the CERN OpenStack [7] cloud resource.
Summary
We would like to run data-intensive applications on globally distributed opportunistic resources that have no local storage. The ATLAS [1] experiment leverages a globally distributed system of infrastructure as a service (IaaS) clouds as part of its distributed computing system. These resources are integrated into the ATLAS distributed computing system using the Cloudscheduler [2] technology developed at the University of Victoria. These IaaS resources do not support any local grid infrastructure. In this paper we describe a system leveraging Cloudscheduler and Dynafed, which successfully executed functional test jobs and transfers as part of the ATLAS distributed computing and data management systems on the CERN OpenStack [7] cloud resource. Data were read from and written to an object store implemented using Ceph [8] and exposing an S3-compatible gateway.