Dynamic Data Citation Service—Subset Tool for Operational Data Management

Chris Schubert, Katharina Sack, Georg Seyerl

doi:10.3390/data4030115

Abstract

In earth observation and the climatological sciences, data and their associated data services grow daily and cover large spatial extents, driven by the high coverage rate of satellite sensors, by model calculations, and by continuous meteorological in situ observations. To reuse such data, especially data fragments, and their data services in a collaborative and reproducible manner by citing the original source, data analysts such as researchers or impact modelers need a way to identify the exact version, the precise time information, the parameters, and the names of the dataset used. Done manually, citing a data fragment as a subset of an entire dataset is complex and imprecise. Data in climate research are in most cases multidimensional, structured grid data that can change partially over time. Citing such evolving content requires the approach of “dynamic data citation”. The applied approach is based on associating queries with persistent identifiers. These queries contain the subsetting parameters, e.g., the spatial coordinates of the desired study area or the time frame with a start and end date. The parameters are automatically included in the metadata of the newly generated subset and thus record the data history, that is, the data provenance, which has to be established in data repository ecosystems. The Research Data Alliance Data Citation Working Group (RDA Data Citation WG) summarized the scientific status quo and the state of the art of existing citation and data management concepts and developed a scalable dynamic data citation methodology for evolving data. The Data Centre at the Climate Change Centre Austria (CCCA) has implemented these recommendations and, since 2017, has offered an operational dynamic data citation service for climate scenario data.
Aware that this topic depends heavily on bibliographic citation research, which is still under discussion, the CCCA service on Dynamic Data Citation focused on issues specific to the climate domain, such as the characteristics of the data, formats, software environment, and usage behavior. Beyond sharing the experience gained, the current effort addresses the scalability of the implementation, e.g., towards a potential Open Data Cube solution.
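
The idea of associating a subsetting query with a persistent identifier can be illustrated with a small sketch. It is not the CCCA implementation: the class names `QueryStore` and `StoredQuery`, the parameter keys (`bbox`, `time_start`, `time_end`), the dataset name, and the identifier prefix are all hypothetical and chosen only for illustration. The sketch assumes that a query is normalized, hashed, timestamped, and stored together with the dataset version, so that an identical request resolves to the same identifier.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StoredQuery:
    pid: str              # hypothetical identifier, not a real DOI or handle
    dataset_id: str
    dataset_version: str
    params: dict          # subsetting parameters as requested by the user
    query_hash: str       # checksum of the normalized query
    executed_at: str      # ISO 8601 timestamp (UTC) of first registration


class QueryStore:
    """In-memory stand-in for a query store (assumption: one record per unique query)."""

    def __init__(self, pid_prefix="hypothetical-prefix"):
        self._pid_prefix = pid_prefix
        self._by_hash = {}

    @staticmethod
    def canonical_form(dataset_id, dataset_version, params):
        # Sort keys so that equivalent queries serialize to the same string.
        payload = {"dataset": dataset_id, "version": dataset_version, "params": params}
        return json.dumps(payload, sort_keys=True, separators=(",", ":"))

    def register(self, dataset_id, dataset_version, params):
        canonical = self.canonical_form(dataset_id, dataset_version, params)
        query_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        # An identical query on the same dataset version reuses its identifier.
        if query_hash in self._by_hash:
            return self._by_hash[query_hash]
        record = StoredQuery(
            pid=f"{self._pid_prefix}/{query_hash[:12]}",
            dataset_id=dataset_id,
            dataset_version=dataset_version,
            params=dict(params),
            query_hash=query_hash,
            executed_at=datetime.now(timezone.utc).isoformat(),
        )
        self._by_hash[query_hash] = record
        return record


# Illustrative values only: a bounding box and a time frame for a study area.
store = QueryStore()
subset_params = {
    "bbox": [9.5, 46.4, 17.2, 49.0],
    "time_start": "2021-01-01",
    "time_end": "2050-12-31",
}
first = store.register("example-climate-scenario", "v1", subset_params)
```

In this sketch only the query and its metadata are stored, not a copy of the subset, and the stored parameters are the same values that would be written into the metadata of the generated subset as provenance information.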

Highlights

Citing datasets in an appropriate manner is agreed upon as good scientific practice and is well established.
The overall objective of this article was to demonstrate the technical implementation and to outline the potential future benefits of the RDA recommendations, with the operational service offered as evidence, such as sustainable storage consumption through use of the Query Store for data subsets and automatic adaptation into interoperable metadata descriptions to preserve data provenance information.
If only a fragment of a dataset is requested, which is served by subset functionalities, a more or less dynamic citation is required [9].

Summary

Introduction

Citing datasets in an appropriate manner is agreed upon as good scientific practice and is well established. A data citation, as a collection of text snippets, provides information about the creator of the data, the title, the version, the repository, a time stamp, and a persistent identifier (PID) for persistent data access. With data-driven web services, the data used are not always static, especially in collaborative iteration and creation cycles [14]. This holds for climatological research, where different data sources and models serve as input for new, derived data, e.g., climate indices such as the number of tropical nights, which is calculated from different climate model ensembles. From a data quality point of view, it is preferable that such derivatives be updated automatically when the correction chain is performed. Such changes, and the dependencies in data creation that they affect, should be communicated as automatically as possible. With the reproducibility of results in mind, it is essential to be able to accurately verify a particular dataset, its exact version, or the creation of data fragments.
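
The citation elements listed above can be pictured as a simple record from which a citation text is assembled. The following is a minimal sketch, assuming one field per element; the names `CitationRecord` and `format_citation`, the output layout, and all example values are hypothetical and do not reproduce any particular citation style or the service's actual output.

```python
from dataclasses import dataclass


@dataclass
class CitationRecord:
    creator: str
    title: str
    version: str
    repository: str
    timestamp: str   # ISO 8601 time stamp of the data or subset creation
    pid: str         # persistent identifier used for data access


def format_citation(record):
    # Assumed layout for illustration; real citation styles differ.
    year = record.timestamp[:4]
    return (
        f"{record.creator} ({year}): {record.title}, version {record.version}. "
        f"{record.repository}. {record.pid} (created {record.timestamp})."
    )


# Placeholder values only.
example = CitationRecord(
    creator="Example Creator",
    title="Example Climate Scenario Subset",
    version="1.0",
    repository="Example Repository",
    timestamp="2019-08-01T00:00:00+00:00",
    pid="hypothetical-prefix/abc123",
)
citation_text = format_citation(example)
```

For a dynamically generated subset, the `pid` field would identify the stored query rather than a static file, so the same text snippet structure serves both whole datasets and data fragments.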


Journal: Data	Publication Date: Aug 1, 2019
Citations: 3	License type: CC BY 4.0


