Abstract

Open access to data, as a core principle of open science, is predicated on assumptions that scientific data can be reused by other researchers. We test those assumptions by asking where scientists find reusable data, how they reuse those data, and how they interpret data they did not collect themselves. By conducting a qualitative meta-analysis of evidence on two long-term, distributed, interdisciplinary consortia, we found that scientists frequently sought data from public collections and from other researchers for comparative purposes such as “ground-truthing” and calibration. When they sought others’ data for reanalysis or for combining with their own data, which was relatively rare, most preferred to collaborate with the data creators. We propose a typology of data reuses ranging from comparative to integrative. Comparative data reuse requires interactional expertise, which involves knowing enough about the data to assess their quality and value for a specific comparison such as calibrating an instrument in a lab experiment. Integrative reuse requires contributory expertise, which involves the ability to perform the action, such as reusing data in a new experiment. Data integration requires more specialized scientific knowledge and deeper levels of epistemic trust in the knowledge products. Metadata, ontologies, and other forms of curation benefit interpretation for any kind of data reuse. Based on these findings, we theorize the data creators’ advantage, that those who create data have intimate and tacit knowledge that can be used as barter to form collaborations for mutual advantage. Data reuse is a process that occurs within knowledge infrastructures that evolve over time, encompassing expertise, trust, communities, technologies, policies, resources, and institutions.

Keywords: data, science, reuse, biomedicine, environmental sciences, open science, data practices, science policy

Highlights

  • Scientific practice and public policy continue to move toward open access to publications, data, software, code, and other research products

  • “I think it's more difficult for people to get in there and make sense of this.” Researchers in both the Center for Embedded Networked Sensing (CENS) and DataFace reused data created by other researchers, by government agencies, and by other trusted sources

  • We develop the theoretical framework for types of data reuses and the data creators’ advantage


Summary

Introduction

Scientific practice and public policy continue to move toward open access to publications, data, software, code, and other research products. To provide open access to research data, stakeholders must build digital archives, populate those archives, and maintain them. While all of these costly public investments are necessary for data reuse, they are not sufficient to ensure that those data are useful for further research, nor that those assets will be reused. An important question for the sciences and for public policy is what kinds of data reuse are made possible by access to public data archives and what kinds are not. When scientists seek data from sources beyond their own laboratories and current collaborations, under what conditions do public data suffice? When do scientists pursue interpersonal contact for further expertise about those data and their contexts of origin? How does data reuse vary by research domain, purposes for potential reuse, access to data creators, and time period? Answers to these questions can guide the design of digital archives, policies for data governance, and public policy for open access to data.

