DiSSCo (the Distributed System of Scientific Collections) is a Research Infrastructure (RI) that aims to provide unified physical (transnational), remote (loans) and virtual (digital) access to the approximately 1.5 billion biological and geological specimens in collections across Europe. DiSSCo represents the largest-ever formal agreement between natural science museums (114 organisations across 21 European countries). With political and financial support from 14 European governments and a robust governance model, DiSSCo will deliver, by 2025, a series of innovative end-user discovery, access, interpretation and analysis services for natural science collections data. As part of DiSSCo's developing data model, we evaluate the application of Digital Objects (DOs), which can act as the centrepiece of its architecture. A DO consists of a bit sequence representing some content, is identified by a globally unique persistent identifier (PID) and is associated with different types of metadata. The PIDs can be used to refer to different types of information, such as locations, checksums, types and other metadata, to enable immediate operations. In the world of natural science collections, data classes that are currently fragmented (inter alia genes, traits, occurrences) and that derive from the study of physical specimens can be reunited as parts of a virtual container (i.e., as components of a Digital Object). These typed DOs, when combined with software agents that scan the data offered by repositories, can act as complete digital surrogates of the physical specimens. In this paper we investigate the architectural and technological applicability of DOs for large-scale data RIs for bio- and geo-diversity, identify the benefits and challenges of a DO approach for the DiSSCo RI, and describe key specifications (incl. metadata profiles) for a new specimen-based DO type.
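To make the notion of a typed DO more concrete, the following minimal Python sketch models a Digital Object as a bit sequence with a PID, metadata, and attached components. It is only an illustration under stated assumptions: the class name `DigitalObject`, its fields, the `attach` and `pid_record` helpers, the type labels, and the PID strings are all hypothetical and are not taken from the DiSSCo specifications.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Illustrative Digital Object: a bit sequence, a PID, and metadata.

    All field names here are assumptions made for this sketch.
    """
    pid: str                      # persistent identifier (placeholder format)
    do_type: str                  # hypothetical type label
    content: bytes = b""          # the bit sequence representing the content
    metadata: dict = field(default_factory=dict)
    components: dict = field(default_factory=dict)  # data class -> list of DOs

    def checksum(self) -> str:
        """Return the SHA-256 hex digest of the content."""
        return hashlib.sha256(self.content).hexdigest()

    def attach(self, data_class: str, component: "DigitalObject") -> None:
        """Attach a component DO under a data class such as 'genes'."""
        self.components.setdefault(data_class, []).append(component)

    def pid_record(self) -> dict:
        """Return the information a PID could refer to in this sketch."""
        return {
            "pid": self.pid,
            "type": self.do_type,
            "checksum": self.checksum(),
            "components": {
                name: [part.pid for part in parts]
                for name, parts in self.components.items()
            },
        }


# Hypothetical specimen DO with one derived data class attached.
specimen = DigitalObject(
    pid="example/specimen-0001",
    do_type="DigitalSpecimen",
    content=b"specimen label transcription",
    metadata={"collection": "example-herbarium"},
)
gene = DigitalObject(
    pid="example/sequence-0001",
    do_type="GeneSequence",
    content=b"ACGT",
)
specimen.attach("genes", gene)
record = specimen.pid_record()
```

In this sketch the specimen DO plays the role of the virtual container: derived data such as gene sequences remain separate objects with their own PIDs and are referenced from the specimen record rather than copied into it.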