Abstract

Over the last decades, the HEP community has undertaken several data preservation efforts, since experiments are not repeatable and their data are consequently considered unique. ARCHIVER is a Horizon 2020 pre-commercial procurement project, co-funded by the European Commission (EC), that procures R&D combining multiple ICT technologies, including data-intensive scalability, network, service interoperability and business models, in a hybrid cloud environment. The results will provide the European Open Science Cloud (EOSC) with archival and preservation services covering the full research lifecycle. The services are co-designed in partnership with four research organisations (CERN, DESY, EMBL-EBI and PIC/IFAE) deploying use cases from Astrophysics, HEP, Life Sciences and Photon-Neutron Sciences. This creates an innovation ecosystem for specialist data archiving and preservation companies willing to introduce new services capable of supporting the expanding needs of research. The HEP use cases being deployed include the CERN Opendata portal, the preservation of a second copy of the data of the completed BaBar experiment, and the CERN Digital Memory, which digitises CERN’s multimedia archive of the 20th century. In parallel, ARCHIVER has established an Early Adopter programme through which additional use cases can be incorporated at each project phase, expanding the services to multiple research domains and countries.

Highlights

  • Data has both a value and an associated cost

  • Acting as a collective of procurers known as the Buyers Group, CERN[6], EMBL-EBI[7], DESY[8] and PIC[9] commit funds, research datasets and testing effort to create an innovation ecosystem for specialist ICT companies active in archiving and digital preservation. These companies are willing to co-develop end-to-end archival and preservation services for data generated in scientific research, supporting the IT requirements of European scientists.

  • The R&D challenge of digital archiving goes beyond storing data: it means keeping intellectual control of data and associated products for decades, so that research outputs remain reusable


Summary

Introduction

Data has both a value and an associated cost. Research data management makes many promises in terms of capacity, scalability, ease of use and security. CERN has announced a new open data policy for scientific experiments at the Large Hadron Collider (LHC) that will make scientific research more reproducible, accessible and collaborative[2]. In spite of these efforts, there is a critical gap in the offering of services that are both standards-based and cost-effective for long-term archiving and preservation[3]. The gap is most evident in the presence of high sustained data ingest rates (10 Gbps) and research data volumes of multiple petabytes. Acting as a collective of procurers known as the Buyers Group, CERN[6], EMBL-EBI[7], DESY[8] and PIC[9] commit funds, research datasets and testing effort to create an innovation ecosystem for specialist ICT companies active in archiving and digital preservation. These companies are willing to co-develop end-to-end archival and preservation services for data generated in scientific research, supporting the IT requirements of European scientists.
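To give a sense of scale for the ingest rate and data volumes quoted above, the following is a minimal back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^15 bytes, 1 Gbps = 10^9 bits per second) and a perfectly sustained link with no protocol overhead; the function names are illustrative and do not come from the ARCHIVER project.

```python
# Back-of-the-envelope ingest arithmetic.
# Assumptions (not from the source): decimal units, no protocol overhead,
# and a link that sustains its nominal rate around the clock.
SECONDS_PER_DAY = 86_400


def days_to_ingest(volume_pb: float, rate_gbps: float) -> float:
    """Days needed to transfer `volume_pb` petabytes at `rate_gbps`."""
    bits = volume_pb * 1e15 * 8            # petabytes -> bits
    seconds = bits / (rate_gbps * 1e9)     # bits / (bits per second)
    return seconds / SECONDS_PER_DAY


def tb_per_day(rate_gbps: float) -> float:
    """Terabytes moved per day at a sustained `rate_gbps`."""
    bytes_per_day = rate_gbps * 1e9 / 8 * SECONDS_PER_DAY
    return bytes_per_day / 1e12


if __name__ == "__main__":
    print(f"10 Gbps moves about {tb_per_day(10):.0f} TB per day")
    print(f"1 PB at 10 Gbps takes about {days_to_ingest(1, 10):.1f} days")
```

Under these assumptions a 10 Gbps link moves roughly 108 TB per day, so each petabyte takes a little over nine days of uninterrupted transfer. Real-world figures would be lower once overhead and contention are taken into account.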

ARCHIVER approach
Design phase
Pilot phase
Sustainability of the resulting services after the project
Conclusions
