Abstract
Data sharing is encouraged to fulfill the ethical responsibility to transform research data into public health knowledge, but it carries risks of improper disclosure and potential harm from the release of individually identifiable data. The study objective was to develop and implement a novel method for scientific collaboration and data sharing that distributes the analytic burden while protecting patient privacy. A procedure was developed wherein an investigator who is external to an analytic coordinating center (ACC) can conduct original research following a protocol governed by a Publications and Presentations (P&P) Committee. The collaborating investigator submits a study proposal and, if approved, develops the analytic specifications using existing data dictionaries and templates. An original data set is prepared according to the specifications, and the external investigator is provided with a complete but de-identified and shuffled data set that retains all key data fields but obfuscates individually identifiable data and patterns; this "scrambled data set" provides a "sandbox" in which the external investigator can develop and test analytic code. The analytic code is then run against the original data at the ACC to generate output, which the external investigator uses in preparing a manuscript for journal submission. The method has been successfully used with collaborators to produce many published papers and conference reports. By distributing the analytic burden, this method can facilitate collaboration and expand analytic capacity, resulting in more science for less money.
Highlights
Data sharing is encouraged to fulfill the ethical responsibility to transform research data into public health knowledge, but it carries risks of improper disclosure and potential harm from the release of individually identifiable data
An external investigator with an idea for a study based on analytic coordinating center (ACC) data submits a written proposal to the Publications and Presentations (P&P) Committee
The investigator is responsible for adhering to ACC policies and guidelines and for producing the final manuscript for publication
Summary
Data sharing is encouraged to fulfill the ethical responsibility to transform research data into public health knowledge, but it carries risks of improper disclosure and potential harm from the release of individually identifiable data. Methods: The study objective was to develop and implement a novel method for scientific collaboration and data sharing that distributes the analytic burden while protecting patient privacy. A procedure was developed wherein an investigator who is external to an analytic coordinating center (ACC) can conduct original research following a protocol governed by a Publications and Presentations (P&P) Committee. An original data set is prepared according to the specifications, and the external investigator is provided with a complete but de-identified and shuffled data set that retains all key data fields but obfuscates individually identifiable data and patterns; this “scrambled data set” provides a “sandbox” in which the external investigator can develop and test analytic code. The analytic code is run against the original data at the ACC to generate output, which the external investigator uses in preparing a manuscript for journal submission. Conclusion: By distributing the analytic burden, this method can facilitate collaboration and expand analytic capacity, resulting in more science for less money.
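The abstract does not specify how the "scrambled data set" is produced, so the following is only a minimal sketch of one plausible approach: drop direct identifiers, then shuffle each remaining column independently across rows. The function name `scramble`, the `id_fields` parameter, and the example field names are hypothetical and do not come from the paper. Under this assumption every key field keeps its name, type, and realistic range of values, so analytic code can be developed and tested against it, but no output row corresponds to a real individual and cross-field patterns are broken.

```python
import random


def scramble(rows, id_fields, seed=None):
    """Return a de-identified, column-shuffled copy of `rows`.

    `rows` is a list of dicts sharing the same keys. This is an assumed
    approach for illustration, not the ACC's documented procedure.
    """
    if not rows:
        return []
    rng = random.Random(seed)
    # Keep every field except the direct identifiers.
    fields = [f for f in rows[0] if f not in id_fields]
    columns = {}
    for f in fields:
        values = [r[f] for r in rows]
        # Shuffle each column on its own so values no longer line up by person.
        rng.shuffle(values)
        columns[f] = values
    # Reassemble rows from the independently shuffled columns.
    return [{f: columns[f][i] for f in fields} for i in range(len(rows))]


# Hypothetical example records (invented for illustration only).
original = [
    {"patient_id": "A1", "age": 34, "sex": "F", "sbp": 118},
    {"patient_id": "B2", "age": 61, "sex": "M", "sbp": 142},
    {"patient_id": "C3", "age": 47, "sex": "F", "sbp": 131},
]
sandbox = scramble(original, id_fields={"patient_id"}, seed=0)
```

Because the columns are shuffled independently, estimates computed on such a sandbox copy are not meaningful; it serves only to check that code runs. This matches the workflow described above, in which the reportable output comes from running the tested code against the original data at the ACC.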