One of the most desired, and still missing, elements needed to realise the concept of Digital Extended Specimens (Webster et al. 2021) is a persistent identifier (PID) for the new digital specimen object. Digital Specimens are created to act as digital surrogates of the physical objects. They contain all data relevant to the specimens as well as derived data such as genetic sequences, trait information, and references to publications, species and environmental information. A PID for the Digital Specimen is thus essential to link it to the extended information. Furthermore, the extended information also needs PIDs so that references to it are unique and resolvable and linking can be bidirectional.

DiSSCo, the Distributed System of Scientific Collections, has done extensive work to select the most appropriate PID scheme (Hardisty et al. 2021) and to design a PID infrastructure for the pan-European specimen collections. The design was created in the Biodiversity Community Integrated Knowledge Library (BiCIKL) project with funding from the European Commission. The draft design has been discussed with technical specialists in the joint DiSSCo and Consortium of European Taxonomic Facilities (CETAF) community and with international stakeholders such as GBIF and iDigBio. The draft was also presented at the 2022 conference of the Society for the Preservation of Natural History Collections (SPNHC) for further feedback, after which it was finalised.

This talk will provide a short overview of the key elements in the design of the PID system for Digital Specimens: the metadata schema, a human-friendly string format, and a PID lifecycle that supports state changes of the physical object such as splitting and merging. We will illustrate this by showing the current status of development.

The Digital Specimen forms the backbone of the FAIR (Findable, Accessible, Interoperable, Reusable) services (Addink et al. 2021) in development by the DiSSCo Prepare project, a project to prepare for the new research infrastructure, which currently involves 170 institutions in 23 countries holding an estimated 1.5 billion specimens. The Digital Specimen is a FAIR Digital Object: it has a PID and machine-actionable metadata that make it FAIR. The PID service ensures the identifiers' integrity and preserves the linkages between specimens and their extended information, improving the FAIR quality of all specimen data.

The talk will describe the system design and the current test deployment setup, and will delve into a few technical details of the Handle Server architecture and PID Kernel records, as well as a roadmap for the service development. The design and test work are done under the auspices of the BiCIKL project, which aims to foster collaboration between infrastructures and to develop bidirectional connections between their data so that the data become part of a large FAIR data pool seamlessly available to researchers (Penev et al. 2022).