Abstract
The objective of this project was to determine the capability of a federated analysis approach using DataSHIELD to maintain the level of results of a classical centralized analysis in a real-world setting. This research was carried out on an anonymous synthetic longitudinal real-world oncology cohort randomly splitted in three local databases, mimicking three healthcare organizations, stored in a federated data platform integrating DataSHIELD. No individual data transfer, statistics were calculated simultaneously but in parallel within each healthcare organization and only summary statistics (aggregates) were provided back to the federated data analyst. Descriptive statistics, survival analysis, regression models and correlation were first performed on the centralized approach and then reproduced on the federated approach. The results were then compared between the two approaches. The cohort was splitted in three samples (N1 = 157 patients, N2 = 94 and N3 = 64), 11 derived variables and four types of analyses were generated. All analyses were successfully reproduced using DataSHIELD, except for one descriptive variable due to data disclosure limitation in the federated environment, showing the good capability of DataSHIELD. For descriptive statistics, exactly equivalent results were found for the federated and centralized approaches, except some differences for position measures. Estimates of univariate regression models were similar, with a loss of accuracy observed for multivariate models due to source database variability. Our project showed a practical implementation and use case of a real-world federated approach using DataSHIELD. The capability and accuracy of common data manipulation and analysis were satisfying, and the flexibility of the tool enabled the production of a variety of analyses while preserving the privacy of individual data. The DataSHIELD forum was also a practical source of information and support. In order to find the right balance between privacy and accuracy of the analysis, set-up of privacy requirements should be established prior to the start of the analysis, as well as a data quality review of the participating healthcare organization.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.