Publishing descriptions of non-public clinical datasets: proposed guidance for researchers, repositories, editors and funding organisations.

Iain Hrynaszkiewicz, Susanna-Assunta Sansone, Andrew L Hufton, Varsha Khodiyar

doi:10.1186/s41073-016-0015-6

Open Access

Abstract

Sharing of experimental clinical research data usually happens between individuals or research groups rather than via public repositories, in part due to the need to protect research participant privacy. This approach to data sharing makes it difficult to connect journal articles with their underlying datasets and is often insufficient for ensuring access to data in the long term. Voluntary data sharing services such as the Yale Open Data Access (YODA) and Clinical Study Data Request (CSDR) projects have increased accessibility to clinical datasets for secondary uses while protecting patient privacy and the legitimacy of secondary analyses, but these resources are generally disconnected from journal articles, where researchers typically search for reliable information to inform future research. New scholarly journal and article types dedicated to increasing accessibility of research data have emerged in recent years and, in general, journals are developing stronger links with data repositories. There is a need for increased collaboration between journals, data repositories, researchers, funders, and voluntary data sharing services to increase the visibility and reliability of clinical research. Using the journal Scientific Data as a case study, we propose and show examples of changes to the format and peer-review process for journal articles to more robustly link them to data that are only available on request. We also propose additional features for data repositories to better accommodate non-public clinical datasets, including Data Use Agreements (DUAs).

Highlights

Open access to research data that can be understood and reused by others is a means to further scientific progress and publish more reliable and reproducible research [1, 2]
We use the term “non-public clinical datasets” to mean datasets that have been generated through experimental clinical research, such as clinical trials, and which are not openly accessible, but are available on request
Throughout this paper, Scientific Data is used as a case study of how these guidelines will work in practice and we propose how similar approaches could be taken by other journals that consider manuscripts describing clinical or other data that cannot be publicly available

Introduction

Open access to research data that can be understood and reused by others is a means to further scientific progress and publish more reliable and reproducible research [1, 2]. Clinical research data often include information that could potentially identify individuals, meaning datasets must be anonymised prior to being shared beyond the study for which the data were originally collected. Although guidelines and processes for anonymisation of clinical data exist [3, 4], publication of freely available clinical datasets (such as [5]) remains uncommon. As open access to clinical datasets is often unfeasible, a more pragmatic approach may be needed. We use the term “non-public clinical datasets” to mean datasets that have been generated through experimental clinical research, such as clinical trials, and which are not openly accessible, but are available on request. Disease-specific or epidemiologic cohort databases, and electronic health records, which can be continually updated and held by an institution, should not be excluded from data sharing but may require specific additional guidance not covered in this paper.