Data citation and digital identifiers for time series data / environmental research infrastructures

Robert Huber, Jesús Marco De Lucas, Ari Asmi, Alberto Michelini, Michael Diepenbroek, Justin Buck, Envri

doi:10.6084/m9.figshare.1285728.v1

Abstract

In the age of data-driven science, the re-use of data and the compilation of existing data from monitoring infrastructures have become an integral part of research. For the sake of transparency and reproducibility, it is crucial to be able to unambiguously identify the data that were used as the basis of a publication. Globally unique, resolvable, persistent digital identifiers (PIDs) for digital data sets are an important tool for achieving this goal, because they enable unambiguous links between published research results and their underlying data. This unambiguous identification also allows data to be cited. Proven, community-based examples are the use of GenBank identifiers in the biological literature and data citation using DOIs (digital object identifiers), which are already widely used in the scholarly literature. Identifying discrete digital objects is simple, and their citations can be formatted by analogy with literature citations. The identification of ongoing, open time series does not seem to fit this pattern. A major prerequisite for the proper use of PIDs within data citations is persistence, both of the identifiers themselves and of the integrity of the associated data sets. This raises questions when PIDs are to be used for unfinished data sets or open time series data. Such data are typically generated within research infrastructures during long-lasting experiments such as satellite missions and environmental monitoring campaigns, or in permanent installations such as natural hazard detection and early warning systems (e.g., seismic traces acquired by field stations). Open time series data are often used in research while the experiments are still ongoing, and results may be published before the underlying data set has been closed and publicly released. It is therefore important to enable the scientific community to properly cite these data in their publications. Yet what does "persistence" of data mean for an ongoing time series? How does it relate to versioning? What is the granularity of a time series? In this publication we discuss and compare solutions currently used in some major European research infrastructures and propose transparent solutions that allow the citation of time series data using PIDs.
