Abstract

When preparing the Data Management Plan for a larger scientific endeavor, PIs have to balance the most appropriate qualities of storage space along the planned data life-cycle against its price and the available funding. Storage properties include the media type, which implicitly determines the access latency and durability of the stored data, the number and locality of replicas, and the available access protocols and authentication mechanisms. Negotiations between the scientific community and the responsible infrastructures generally happen upfront and cover the amount of storage space, the media types (such as disk, tape and SSD) and the foreseeable data life-cycles. With the introduction of cloud management platforms, in both computing and storage, resources can be brokered to achieve the best price per unit of a given quality. However, for the platform orchestrator to negotiate the most appropriate resources programmatically, a standard vocabulary for the different properties of resources and a commonly agreed protocol to communicate them have to be available. To agree on a basic vocabulary for storage space properties, the storage infrastructure group in INDIGO-DataCloud, together with INDIGO-associated and external scientific groups, created a working group under the umbrella of the Research Data Alliance (RDA). The Cloud Data Management Interface (CDMI) has been selected as the communication protocol to query and negotiate storage qualities. Necessary extensions to CDMI are defined in regular meetings between INDIGO and the Storage Networking Industry Association (SNIA). Furthermore, INDIGO is contributing to the SNIA CDMI reference implementation, both as the basis for interfacing the various storage systems in INDIGO to the agreed protocol and to provide an official open-source skeleton for systems not maintained by INDIGO partners.
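The brokering idea described in the abstract, choosing the cheapest storage offer that still meets the required quality, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the property names (`latency_ms`, `replicas`, `price_per_tb`), the example offers and their prices are hypothetical and are not taken from the RDA vocabulary, the CDMI specification, or any INDIGO component.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class StorageQuality:
    """One storage offer, described by a few illustrative properties."""
    name: str
    media: str           # e.g. "disk", "tape", "ssd"
    latency_ms: float    # typical access latency (hypothetical figure)
    replicas: int        # number of copies kept
    price_per_tb: float  # price in arbitrary units (hypothetical figure)


@dataclass(frozen=True)
class Requirement:
    """What a data life-cycle stage needs from its storage."""
    max_latency_ms: float
    min_replicas: int


def cheapest_match(offers: Iterable[StorageQuality],
                   req: Requirement) -> Optional[StorageQuality]:
    """Return the lowest-priced offer that satisfies req, or None."""
    candidates = [
        o for o in offers
        if o.latency_ms <= req.max_latency_ms and o.replicas >= req.min_replicas
    ]
    return min(candidates, key=lambda o: o.price_per_tb, default=None)


# Hypothetical offers; the numbers are invented for the example.
OFFERS = [
    StorageQuality("ssd-single", "ssd", 1.0, 1, 90.0),
    StorageQuality("disk-double", "disk", 20.0, 2, 40.0),
    StorageQuality("tape-double", "tape", 60000.0, 2, 10.0),
]

if __name__ == "__main__":
    analysis = Requirement(max_latency_ms=100.0, min_replicas=2)
    archive = Requirement(max_latency_ms=1_000_000.0, min_replicas=2)
    print(cheapest_match(OFFERS, analysis))
    print(cheapest_match(OFFERS, archive))
```

In this sketch a latency-sensitive stage ends up on replicated disk, while a stage that tolerates long latencies falls through to the cheaper tape offer. A real orchestrator would obtain the offers by querying each storage system through the agreed protocol rather than from a hard-coded list.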

Highlights

  • SCRATCH FAST (the need for speed); OUTPUT (external redundant copies); OUTPUT (not yet redundant); LOW-COST (latency not an issue); ARCHIVAL (expensive to recreate).

  • CHEP 2016, San Francisco, USA, 2016-10-11. https://indico.cern.ch/event/505613/contributions/2230920/

  • Paul Millar (DESY), Marcus Hardt (KIT), Vladimir Sapunenko (INFN-CNAF), Giacinto Donvito (INFN-Bari)

