The infrastructure for the Distributed System of Scientific Collections (DiSSCo) is under active development. Work within the DiSSCo Transition Project has focused on building infrastructure, creating data models, and setting up Application Programming Interfaces (APIs) (Koureas et al. 2024). In recent years, DiSSCo has presented this work at several Biodiversity Information Standards (TDWG) conferences (Leeflang and Addink 2023, Leeflang et al. 2022, Addink et al. 2021). This year’s session focuses on the human-facing application: DiSSCover. DiSSCover is the graphical user interface through which users can interact with Findable, Accessible, Interoperable and Reusable (FAIR) Digital Objects (FDOs), facilitating the curation and enhancement of specimen data (Islam 2024). Development started in 2022 and is ongoing. The interface acts as a gateway into the DiSSCo infrastructure, providing access to digital specimens and media. Data retrieved from the core DiSSCo API is converted into an easily readable format and made discoverable through a diverse set of filters. DiSSCover’s main focus is to allow users to annotate the data. Through annotations, we connect expert and machine-generated information to create extended digital specimens (Hardisty et al. 2022), for example by creating linkages to other infrastructures, correcting or adding information, or triggering machine annotation services. Machine annotation services are automated, scalable tools that run in the background and automatically curate and extend the specimen (Addink et al. 2023). Human users will remain important, as all annotations made by machine annotation services can be reviewed by a trusted person. Annotations are FAIR Digital Objects and target a specific part of a specimen, be it a data fragment or an associated media file. At the heart of DiSSCover lies the Open Digital Specimen data specification (Leeflang and Addink 2023).
The specification aims to harmonise multiple data standards into one generic specification based on the new Global Biodiversity Information Facility (GBIF) Unified Model (Robertson et al. 2022). The data is stored as JavaScript Object Notation (JSON) based on JSON Schemas (Anonymous 2024). Annotations are linked to specific data attributes using a JSON path as the identifier. Data attributes can be individual terms, collections of terms called classes, or the whole object. This creates a flexible but complex data structure, which we based on the World Wide Web Consortium (W3C) Web Annotation Data Model (Sanderson et al. 2017). The W3C Web Annotation Data Model contains two main components: the target and the body. The target specifies which data attribute the annotation is made on, for example the term ‘ods:specimenName’. This is a local term within the Open Digital Specimen (ods) namespace, which holds the accepted name of the digital specimen. The annotation body holds the value(s) that are appended to the digital specimen, and differs depending on the annotation motivation. DiSSCo recognises five annotation motivations: addition, modification, comment, assessment and deletion, each of which has its own function. This structure should be able to handle any information the user wants to add to the object. The challenge for DiSSCover is to preserve the complex structure of annotations while making it convenient for users to work with. The session will look at the different kinds of annotations and their use from a practical perspective. A demonstration of DiSSCover will show how users can create annotations, providing insight into the process that will give shape to DiSSCo’s main goal of enriching natural history data.
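The target and body structure described above can be illustrated with a minimal Python sketch. It is not the DiSSCo implementation: the field names (`motivation`, `target`, `jsonPath`, `body`, `values`), the helper functions, the example specimen values, and the way each motivation is applied to a specimen are all hypothetical and chosen for illustration only. The sketch also handles only a path that points at a single top-level term, such as ‘ods:specimenName’.

```python
import copy

# The five motivations named in the text; the string labels are assumptions.
MOTIVATIONS = {"addition", "modification", "comment", "assessment", "deletion"}


def make_annotation(motivation, json_path, values):
    """Build a simplified annotation with a target and a body (hypothetical layout)."""
    if motivation not in MOTIVATIONS:
        raise ValueError(f"unknown motivation: {motivation}")
    return {
        "motivation": motivation,
        "target": {"jsonPath": json_path},
        "body": {"values": list(values)},
    }


def term_from_path(json_path):
    """Extract the term from a path of the form $['term'] (top-level terms only)."""
    prefix, suffix = "$['", "']"
    if not (json_path.startswith(prefix) and json_path.endswith(suffix)):
        raise ValueError(f"unsupported path: {json_path}")
    return json_path[len(prefix):-len(suffix)]


def apply_annotation(specimen, annotation):
    """Return a copy of the specimen with the annotation applied (assumed semantics)."""
    result = copy.deepcopy(specimen)
    term = term_from_path(annotation["target"]["jsonPath"])
    motivation = annotation["motivation"]
    if motivation in ("addition", "modification"):
        # Assumption: the first body value becomes the value of the targeted term.
        result[term] = annotation["body"]["values"][0]
    elif motivation == "deletion":
        result.pop(term, None)
    # Comment and assessment leave the specimen data unchanged in this sketch.
    return result


# Made-up specimen record with a misspelled name, corrected by a modification.
specimen = {"ods:specimenName": "Bellis perenis"}
fix = make_annotation("modification", "$['ods:specimenName']", ["Bellis perennis L."])
print(apply_annotation(specimen, fix))
```

Keeping the annotation separate from the specimen and applying it to a copy mirrors the idea that annotations are objects in their own right that can be reviewed before they change anything.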