Abstract

At the Helmholtz Association, we strive to establish a well-formed, harmonized data space that connects information across distributed data infrastructures. This requires standardizing the description of datasets with suitable metadata to achieve interoperability and machine actionability. One way to connect datasets and avoid redundancy in metadata is the consistent use of Persistent Identifiers (PIDs). PIDs are an integral element of the FAIR principles (Wilkinson et al. 2016) and are recommended for referring to datasets. Other metadata entities, such as people, organizations, projects, laboratories, repositories, publications, vocabularies, samples, instruments, licenses, and methods, should also be referenced by PIDs, although agreed identifiers do not yet exist for all of them. Consistently integrating the existing PIDs into data infrastructures can create a high level of interoperability, making it possible to connect datasets from different repositories through common metadata. In HMC, we start this process by implementing PIDs for people (ORCID) and organizations (ROR) in data infrastructures. Harmonizing PID metadata, however, is only the first step in setting up a data space. Here we outline the strategies we recommend for implementation within the Helmholtz Association and suggest which stakeholder groups should be included and made responsible for maintaining PID metadata in order to shape the Helmholtz Data Space. The conclusions from this process not only affect the implementation of PID metadata but may also be used for the harmonization of vocabularies, digital objects, interfaces, licenses, quality flags, and others, in order to connect our global data systems, redefine stakeholder responsibility, and ultimately realize the data space.
