Abstract

This paper summarizes the findings of an analysis of scientific infrastructure service providers. These providers were evaluated with regard to their potential services for the management of publication-related research data. Through desk research and an online survey, we found that almost three quarters of all responding research data centres, archives and libraries generally store externally generated research data, which also applies to publication-related data. Almost 75% of all respondents also store and host the code of computation (the syntax of statistical analyses). If self-written software components have been used to generate research outputs, only 40% of all respondents accept these components for storing and hosting. Eight in ten institutions also stated that they take specific actions for the long-term digital preservation of their data. With regard to the documentation of stored and hosted research data, almost 70% of all respondents claimed to use the metadata schema of the Data Documentation Initiative (DDI); Dublin Core was used by 30% (multiple answers were permitted). Almost two thirds also used persistent identifiers to facilitate citation of these datasets. Three in four respondents also stated that they support researchers in creating metadata for their data. Application programming interfaces (APIs) for uploading or searching datasets have not yet been implemented by any of the respondents. The use of semantic technologies such as RDF is not widespread.

Highlights

  • In the social sciences more and more researchers analyse data provided by official statistics or by specialised providers of research data

  • Experiences in other scientific areas are integrated in our suggestions for establishing data archives that are based on the complementary know-how of research data centres (RDCs) and libraries

  • Within our analyses we examined the availability of application programming interfaces (APIs), which enable automated data exchanges


Summary

Background and introduction

In the social sciences (especially economics, political science and sociology) more and more researchers analyse data provided by official statistics or by specialised providers of research data (e.g., from the ALLBUS at GESIS or from the SOEP at DIW Berlin). Although a rising number of publications in almost all scientific disciplines are based on the analysis of datasets, there are few effective ways to replicate or re-examine the results of an empirical article, to verify them, or to make the underlying data available for re-use and to support scholarly debate. The current situation confronts both the scientific community and scientific infrastructure service providers, such as libraries and research data centres, with multiple challenges. In particular, the roles and responsibilities of scientific infrastructure providers, e.g., research data centres (RDCs), in managing and operating a data archive that facilitates the replication of published research are often not clearly outlined. Our paper describes the outcome of desk research and an online survey evaluating scientific infrastructure providers with respect to their potential services for the management of publication-related research data in the field of social sciences. Experiences in other scientific areas are integrated in our suggestions for establishing data archives that are based on the complementary know-how of research data centres (RDCs) and libraries.

Why is social science research often not replicable?
The online-survey
Empirical findings
Datasets
Software
Metadata schemata and the creation of metadata
Conclusion and Discussion
Findings
15 Erratum
