Long-Term Reusability of Biodiversity and Collection Data using a National Federated Data Infrastructure

Peter Grobe, Birgit Klasen, Tanja Weibulat, Maren Gleisberg, Anke Penzlin, Dagmar Triebel, Juan Monje

doi:10.3897/biss.3.37414

Abstract

GFBio, the “German Federation for Biological Data”, is a data infrastructure and network set up by several research institutions in Germany. It fosters the archiving and long-term reusability of research data and provides free and open access via a joint web portal at www.gfbio.org. As part of the working procedures, data are semantically enriched and provided via a visualization and analysis tool. The main aim of the infrastructure is to make research data from the biological domain reusable and accessible in the long run, following the FAIR principles. To achieve this, several workflows and best practices have been established. The archiving of biodiversity and collection research data follows the reference model for an Open Archival Information System (OAIS, ISO 14721). Two challenges stand in the way of making data reusable: the heterogeneity of the data, and their often implicit but differing semantics, which make data integration difficult. The use of data management plans is one approach we take to meet these challenges. Data management plans contain information about the research data, the tools used to acquire the data, the content and exchange formats, the metadata required to describe the data, and finally the costs and resources needed by data providers to deliver structured “Submission Information Packages” (SIPs) in the sense of OAIS. Archiving a data package as an “Archival Information Package” (AIP) is not sufficient to make it reusable in the future. Changes in semantic meaning over time (content obsolescence), changes in formats (format obsolescence), and changes in the technology of storage media (hardware obsolescence) are the major factors to be considered here. According to the FAIR principles and to our understanding, data are best preserved if they are visible and available for use. The biodiversity and collection data centers involved in GFBio therefore have a curation layer (cf. management in OAIS) in the archiving pipeline, which assembles their in-house management systems for sample and observation data and their asset management systems for all kinds of multimedia. This layer allows continuous quality control and review of the incoming information packages, so data providers can keep maintaining their data if they wish. The data are stored as AIPs sensu OAIS at the specialized data centers and are accessed by GFBio's core system. Dissemination Information Packages (DIPs) can be generated from the data at any time and disseminated using content standards for data and metadata, such as EML, ABCD and MIxS. Data are available via the GFBio website and, in parallel, via other web portals from the biological domain, e.g. INSDC and GBIF. The GFBio data centers now strive for certification of their archiving processes using the CoreTrustSeal and for certification of the FAIRness of single data records. Established data flows and documentation on best practices are available at www.gfbio.org/data-centers.
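The package flow described in the abstract (SIP in, curation check, versioned AIP, DIP generated on demand) can be illustrated with a small sketch. It is not GFBio's implementation: the class names, the `REQUIRED_METADATA` keys, and the functions `curate`, `archive` and `disseminate` are all hypothetical, and the DIP is only labeled with a content standard rather than serialized to real EML, ABCD or MIxS.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical minimal metadata a data management plan might require.
REQUIRED_METADATA = ("title", "creator", "license")


@dataclass(frozen=True)
class SubmissionInformationPackage:
    records: tuple   # rows delivered by the data provider
    metadata: dict   # descriptive metadata for the data set


@dataclass(frozen=True)
class ArchivalInformationPackage:
    identifier: str
    version: int
    records: tuple
    metadata: dict
    archived_at: str  # ISO 8601 timestamp in UTC


@dataclass(frozen=True)
class DisseminationInformationPackage:
    source_identifier: str
    source_version: int
    content_standard: str  # a label only, e.g. "ABCD"; no real serialization
    payload: dict


def curate(sip):
    """Curation-layer check: return the required metadata keys that are missing."""
    return [key for key in REQUIRED_METADATA if not sip.metadata.get(key)]


def archive(sip, identifier, previous=None):
    """Turn a reviewed SIP into an AIP; a resubmission yields a new version."""
    missing = curate(sip)
    if missing:
        raise ValueError(f"SIP rejected, missing metadata: {missing}")
    version = 1 if previous is None else previous.version + 1
    return ArchivalInformationPackage(
        identifier=identifier,
        version=version,
        records=tuple(sip.records),
        metadata=dict(sip.metadata),
        archived_at=datetime.now(timezone.utc).isoformat(),
    )


def disseminate(aip, content_standard):
    """Generate a DIP from the stored AIP whenever one is requested."""
    payload = {"metadata": dict(aip.metadata), "records": list(aip.records)}
    return DisseminationInformationPackage(
        aip.identifier, aip.version, content_standard, payload
    )
```

In this sketch a provider who updates a data set resubmits a SIP and passes the earlier AIP as `previous`, so the archive keeps a version history while DIPs are always derived from the latest AIP.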


Journal: Biodiversity Information Science and Standards
Publication Date: Jun 26, 2019
Citations: 2
License type: CC BY 4.0
