Abstract

This article aims to evaluate how and to what extent metadata of datasets indexed in DataCite offer clear human- or machine-readable information that enables the research data to be linked to a particular research institution. Two main pathways are explored. First, researchers can encode their affiliation information at the moment of data submission. This can be done by means of free-text metadata fields or via the inclusion of identifiers such as GRID/ROR and ORCID. Second, affiliation information can be traced indirectly through linking between a dataset and associated publications, given that the metadata of publications is often more explicit about affiliation information than the metadata of datasets. Both pathways of affiliation information encoding are evaluated on the basis of metadata pertaining to datasets created at the five Flemish universities. It is shown that good practices such as encoding of affiliation information in a dedicated metadata field or inclusion of ORCID in the metadata are on the rise, but could be expanded further. Finally, the establishment of links between datasets and related publications is often lacking in dataset metadata, although there are important differences between data repositories, as is also demonstrated in a more data-intensive follow-up analysis based on random samples of metadata records. It is important that data repositories address this issue by providing a metadata field clearly dedicated to associated publications, prominently displayed on the landing page of the dataset.

Highlights

  • In the broader context of funder requirements with regard to research data and open science, long-term preservation of research data is gaining in importance (Van Wettere, Data Science Journal)

  • This study aims to answer the following two research questions concerning affiliation linking: 1. In which ways is affiliation information included in the metadata pertaining to research data? If researchers are able to record their affiliation to particular research institutions in the metadata of the archived research data, the research data can be more easily linked to the host institution(s) in question

  • The results section addresses four main topics, namely (1) an overview of the metadata extracted from DataCite for this study, (2) an analysis of the different ways in which data creators encode affiliation information in the metadata of datasets, (3) an evaluation of how DataCite performs with regard to the detection of research output related to archived datasets and (4) a follow-up analysis that examines more closely the different data repositories on the basis of larger random samples

Summary

Introduction

In the broader context of funder requirements with regard to research data and open science, long-term preservation of research data is gaining in importance. The problem of linking research data to a research institution hinges upon the completeness and quality of the metadata associated with the research data and the degree to which links between research objects are available in metadata hubs. Affiliation information can be encoded either as free text or in a structured, machine-readable format.
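To make the distinction between free-text and structured affiliation encoding concrete, the following Python sketch collects the affiliation signals discussed above from a single metadata record. It is an illustration only, not the method used in the study. The record is hypothetical, its identifiers are placeholders, and the field names (`creators`, `affiliation`, `nameIdentifiers`, `relatedIdentifiers`) are assumed to follow the DataCite metadata schema.

```python
# Hypothetical record; field names are assumed to follow the DataCite
# metadata schema, and all identifiers below are placeholders.
record = {
    "creators": [
        {
            "name": "Doe, Jane",
            # Structured affiliation with an institutional identifier
            "affiliation": [
                {
                    "name": "Example University",
                    "affiliationIdentifier": "https://ror.org/00example",
                    "affiliationIdentifierScheme": "ROR",
                }
            ],
            "nameIdentifiers": [
                {
                    "nameIdentifier": "https://orcid.org/0000-0000-0000-0000",
                    "nameIdentifierScheme": "ORCID",
                }
            ],
        },
        # Free-text affiliation only
        {"name": "Roe, Richard", "affiliation": ["Example University, Dept. of X"]},
    ],
    # Link from the dataset to an associated publication
    "relatedIdentifiers": [
        {
            "relatedIdentifier": "10.0000/example",
            "relatedIdentifierType": "DOI",
            "relationType": "IsSupplementTo",
        }
    ],
}


def affiliation_signals(record):
    """Collect direct and indirect affiliation signals from one record."""
    signals = {"free_text": [], "ror": [], "orcid": [], "related": []}
    for creator in record.get("creators", []):
        for aff in creator.get("affiliation", []):
            if isinstance(aff, str):
                # Affiliation given as a plain string
                signals["free_text"].append(aff)
                continue
            if aff.get("name"):
                signals["free_text"].append(aff["name"])
            if aff.get("affiliationIdentifierScheme", "").upper() == "ROR":
                signals["ror"].append(aff.get("affiliationIdentifier"))
        for ident in creator.get("nameIdentifiers", []):
            if ident.get("nameIdentifierScheme", "").upper() == "ORCID":
                signals["orcid"].append(ident.get("nameIdentifier"))
    for rel in record.get("relatedIdentifiers", []):
        # Indirect pathway: links to related research output
        signals["related"].append(
            (rel.get("relationType"), rel.get("relatedIdentifier"))
        )
    return signals


signals = affiliation_signals(record)
```

A record that yields only `free_text` entries can still be matched to an institution, but this requires string matching on name variants. ROR and ORCID identifiers, as well as links to related publications, give a machine-readable route to the same information.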
