Abstract

Data infrastructures manage the life cycle of digital assets and allow users to efficiently discover them. To improve the Findability, Accessibility, Interoperability and Re-usability (FAIRness) of digital assets, a data infrastructure needs to provide digital assets with not only rich meta information and semantic context information but also globally resolvable identifiers. Persistent Identifiers (PIDs), such as the Digital Object Identifier (DOI), are often used by data publishers and infrastructures. The traditional IP network and client-server model can cause congestion and delays when many consumers access data simultaneously. In contrast, Information Centric Networking (ICN) technologies such as Named Data Networking (NDN) adopt a data-centric approach in which digital data objects, once requested, may be stored on intermediate hops in the network. Subsequent requests for the same digital object can then be served by these intermediate hops (caching). This approach distributes traffic load more efficiently and reliably than host-to-host, connection-oriented techniques, and offers attractive opportunities for sharing digital objects across distributed networks. However, such an approach also faces several challenges. It requires an effective translation between the different naming schemas of PIDs and NDN, in particular to support PIDs from different publishers or repositories. Moreover, the planning and configuration of an ICN environment for distributed infrastructures lack an automated solution. To bridge this gap, we propose an ICN planning service with specific consideration of interoperability across PID schemas in the Cloud environment.

Highlights

  • Data infrastructures manage the life cycle of data assets, and allow users to effectively discover and utilize data for their specific purposes

  • We extend our earlier work on Named Data Networking (NDN)-as-a-service for Persistent Identifiers (PIDs) data objects (NaaS4PID) [3], and look at three aspects of applying NDN in data infrastructures: 1) how to seamlessly publish version-controlled digital objects with a PID, 2) how to plan and deploy a customized virtual NDN environment for user communities on different infrastructures, and 3) how to distribute digital objects from distributed data sources with heterogeneous PID publishers

  • The researcher explained that the difference in NDN naming of different PID providers must be taken into account, such that the correct


Summary

INTRODUCTION

Data infrastructures manage the life cycle of data assets, and allow users to effectively discover and utilize data for their specific purposes. When discovering and utilizing resources across different infrastructures, users still face the challenge of limited FAIRness of digital assets. Improving FAIRness has become a common problem for many research infrastructures, and several projects have been funded for this purpose. While Network Functions Virtualization (NFV) [2] may provide efficient resource management and sharing over common network infrastructures, these advanced technologies still lack seamless support for being embedded in the data management life cycle, e.g., for discovering and sharing digital objects with PIDs. Koulouzis et al. discussed the feasibility of using ICN solutions like NDN to distribute digital objects with PIDs. However, that work did not solve the challenge of the incompatible (heterogeneous) nature of the PID standards, and only implemented the PID schema from a single publisher [3]. In the rest of the section, we review the related work on these topics and discuss the key contributions of our work.
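
To make the naming-translation challenge concrete, the following is a minimal sketch of how PIDs from different schemes might be mapped onto hierarchical NDN-style names. It is not the translation used in the paper: the `/pid/...` prefixes, the `scheme:identifier` input form, the function name `pid_to_ndn_name`, and the example identifiers are all assumptions made for illustration.

```python
from urllib.parse import quote

# Hypothetical NDN prefix per PID scheme; these values are illustrative only.
SCHEME_PREFIXES = {
    "doi": "/pid/doi",
    "hdl": "/pid/hdl",
}


def pid_to_ndn_name(pid: str) -> str:
    """Translate a 'scheme:identifier' PID into a hierarchical NDN-style name."""
    scheme, sep, identifier = pid.partition(":")
    scheme = scheme.strip().lower()
    if not sep or scheme not in SCHEME_PREFIXES or not identifier:
        raise ValueError(f"unsupported or malformed PID: {pid!r}")
    # Each '/'-separated part of the identifier becomes one name component;
    # percent-encode anything that is not safe inside a component.
    components = [quote(part, safe="") for part in identifier.split("/") if part]
    return "/".join([SCHEME_PREFIXES[scheme], *components])


# Made-up identifier, used only to show the shape of the resulting name.
print(pid_to_ndn_name("doi:10.1234/example.dataset"))
```

Keeping one prefix per scheme means that supporting an additional PID publisher only requires a new table entry, while requests for the same object always resolve to the same name and can therefore be answered from a cache.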

Persistent Identifiers
PID objects over ICN
Named Data Networking
Network planning and customization
Summary
PROPOSED SOLUTION
PID assignment during data management
Support multiple PID repositories
NDN on virtual infrastructures
NDN automation
SYSTEM PROTOTYPE
Performance characteristics
SUMMARY
A SeaDataNet case study
Discussion
Conclusions
Future work