Abstract
The EU-funded ESCAPE project aims at enabling a prototype federated storage infrastructure, a Data Lake, that would handle data on the exabyte scale, address the FAIR data management principles and provide science projects with a unified, scalable data management solution for accessing and analyzing large volumes of scientific data. To this end, data transfer and management technologies such as Rucio, FTS and GFAL are employed, along with monitoring solutions such as Grafana, Elasticsearch and perfSONAR. This paper describes the technical details behind the testing and monitoring machinery of the Data Lake: continuous automated functional testing, network monitoring and the development of insightful visualizations that reflect the current state of the system. It also addresses the integration with the CRIC information system and the initial support for token-based authentication and authorization using OpenID Connect. The current architecture of these components is presented and future enhancements are discussed.
Highlights
The European Union funded ESCAPE project (European Science Cluster of Astronomy & Particle physics ESFRI research infrastructures) [1] [2] consists of a synergy between ESFRI projects and science organizations [3] which aim at establishing a single collaborative cluster of generation facilities in the area of astronomy and accelerator-based particle physics in order to implement a functional link between those and EOSC [4].
One of the main objectives of the Data Infrastructure for Open Science work package (DIOS) is to build a prototype scalable federated data infrastructure, a Data Lake, that would be able to handle data on the exabyte scale and address the FAIR [7] data management principles, that is, the data would need to be Findable, Accessible, Interoperable and Reusable. This infrastructure would facilitate access to the scientific data via the ESAP science platform and provide the tools and documentation for such platforms to seamlessly
This paper presents an overview of the architecture of this system and focuses on the methodology and technologies used to test and monitor its data management capabilities.