Abstract

Background: In clinical research, important variables may be collected from multiple data sources. Physical pooling of patient-level data from multiple sources often raises several challenges, including proper protection of patient privacy and proprietary interests. We previously developed an SAS-based package to perform distributed regression (a suite of privacy-protecting methods that perform multivariable-adjusted regression analysis using only summary-level information) with horizontally partitioned data, a setting where distinct cohorts of patients are available from different data sources. We integrated the package with PopMedNet, an open-source file transfer software, to facilitate secure file transfer between the analysis center and the data-contributing sites. The feasibility of using PopMedNet to facilitate distributed regression analysis (DRA) with vertically partitioned data, a setting where the data attributes from a cohort of patients are available from different data sources, was unknown.

Objective: The objective of the study was to describe the feasibility of using PopMedNet and enhancements to PopMedNet to facilitate automatable vertical DRA (vDRA) in real-world settings.

Methods: We gathered the statistical and informatic requirements of using PopMedNet to facilitate automatable vDRA. We enhanced PopMedNet based on these requirements to improve its technical capability to support vDRA.

Results: PopMedNet can enable automatable vDRA. We identified and implemented two enhancements to PopMedNet that improved its technical capability to perform automatable vDRA in real-world settings. The first was the ability to simultaneously upload and download multiple files, and the second was the ability to directly transfer summary-level information between the data-contributing sites without a third-party analysis center.

Conclusions: PopMedNet can be used to facilitate automatable vDRA to protect patient privacy and support clinical research in real-world settings.

Highlights

  • Researchers often have to pool data from multiple sources for their studies

  • We have previously developed an SAS-based distributed regression analysis (DRA) package integrated with PopMedNet, an open-source file transfer software [14,15,16,17], to perform automatable distributed regression within horizontally partitioned data environments

  • We describe the feasibility of using PopMedNet and enhancements to PopMedNet to facilitate automatable vertical DRA (vDRA) to protect patient privacy and support clinical research in real-world settings


Summary

Introduction

Researchers often have to pool data from multiple sources for their studies. One common scenario is to combine data from multiple distinct cohorts of patients to achieve sufficient statistical power, especially in studies where the exposure or outcome of interest is rare. It is possible to relativize certain data attributes such as dates (eg, by setting the cohort entry date as time zero and converting all dates to numerical values relative to time zero), perturb data attributes that may be used to reidentify patients (eg, rare covariates or laboratory values), or encrypt the deidentified patient-level data. These data manipulation techniques may not be feasible in certain studies and do not always guarantee adequate levels of privacy protection, which may deter collaboration and data sharing. We previously developed an SAS-based package to perform distributed regression (a suite of privacy-protecting methods that perform multivariable-adjusted regression analysis using only summary-level information) with horizontally partitioned data, a setting where distinct cohorts of patients are available from different data sources.

Conclusions: PopMedNet can be used to facilitate automatable vDRA to protect patient privacy and support clinical research in real-world settings.
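To make the idea of regression from summary-level information concrete, the following is a minimal sketch for the horizontally partitioned case, assuming ordinary least-squares linear regression. It is not the SAS-based package described in the text, and it does not cover the vertically partitioned setting, which needs different techniques. The function names (`site_summaries`, `pooled_fit`, `solve`) and the toy data are hypothetical. Each site shares only the matrices X'X and X'y; the analysis center adds them up and solves the normal equations, which gives the same coefficients as fitting the physically pooled data.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def site_summaries(X: Matrix, y: Vector) -> Tuple[Matrix, Vector]:
    """Run at a data-contributing site: return only X'X and X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return xtx, xty


def solve(A: Matrix, b: Vector) -> Vector:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def pooled_fit(summaries: List[Tuple[Matrix, Vector]]) -> Vector:
    """Run at the analysis center: sum the site summaries, then solve."""
    p = len(summaries[0][1])
    xtx = [[sum(s[0][i][j] for s in summaries) for j in range(p)] for i in range(p)]
    xty = [sum(s[1][i] for s in summaries) for i in range(p)]
    return solve(xtx, xty)


# Toy data (hypothetical): two sites, outcome generated as y = 2 + 3x.
# The first column of each row is the intercept term.
X_a, y_a = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [2.0, 5.0, 8.0]
X_b, y_b = [[1.0, 3.0], [1.0, 5.0]], [11.0, 17.0]

beta = pooled_fit([site_summaries(X_a, y_a), site_summaries(X_b, y_b)])
direct = solve(*site_summaries(X_a + X_b, y_a + y_b))  # physically pooled fit
print([round(b, 6) for b in beta])  # approximately [2.0, 3.0]
```

In this sketch no patient-level row leaves a site; only the aggregated cross-product matrices are transferred. Models fitted iteratively, such as logistic regression, would require several rounds of summary exchange rather than a single one.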
