Abstract

Although the rise of big data in occupational health offers new opportunities, especially for cross-cutting research, it raises issues of data privacy and security, particularly when linking sensitive data from insurance, occupational health, or compensation claims. We aimed to validate a large, blinded synthetic database developed from the CONSTANCES cohort by comparing associations between three independently selected outcomes and various exposures. From the CONSTANCES cohort, a large synthetic dataset was constructed using the avatar method (Octopize), which is agnostic to the primary or secondary uses of the data. Three main analyses of interest were chosen to compare associations between the raw and avatar datasets: risk of stroke (any stroke and stroke subtypes), risk of knee pain, and limitations associated with knee pain. Logistic models were computed, and paired odds ratios (ORs) were compared qualitatively. Both the raw and avatar datasets included 162,434 observations and 19 relevant variables. Of the 172 paired raw/avatar ORs computed, including analyses stratified by sex, more than 77% had an OR difference ≤0.5 and fewer than 7% showed a discrepancy in the statistical significance of the associations, with a Cohen's kappa coefficient of 0.80. This study shows the flexibility and multiple uses of a synthetic database created with the avatar method in the particular field of occupational health; such a database can be shared in open access without risking re-identification or privacy issues and can help bring new insights into complex phenomena such as return to work.
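
The paired comparison summarized above can be illustrated with a minimal Python sketch. It assumes that an association counts as statistically significant when its 95% confidence interval excludes 1, which the abstract does not state. The function names (`cohen_kappa`, `is_significant`, `compare_pairs`) and the four toy OR pairs are hypothetical and are not taken from the study.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of binary labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n  # share of positive labels in the first sequence
    p_b = sum(b) / n  # share of positive labels in the second sequence
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:
        return 1.0  # degenerate case: both sequences are constant and identical
    return (observed - expected) / (1 - expected)


def is_significant(ci_low, ci_high):
    """Assumption: an OR is significant when its CI excludes 1."""
    return ci_low > 1.0 or ci_high < 1.0


def compare_pairs(pairs, max_diff=0.5):
    """Summarize agreement between paired raw and avatar odds ratios.

    Each pair is (raw_or, raw_low, raw_high, avatar_or, avatar_low, avatar_high).
    """
    raw_sig = [is_significant(p[1], p[2]) for p in pairs]
    avatar_sig = [is_significant(p[4], p[5]) for p in pairs]
    close = sum(abs(p[0] - p[3]) <= max_diff for p in pairs)
    discordant = sum(r != a for r, a in zip(raw_sig, avatar_sig))
    return {
        "share_close": close / len(pairs),
        "share_discordant": discordant / len(pairs),
        "kappa": cohen_kappa(raw_sig, avatar_sig),
    }


# Toy values for illustration only (not results from the study).
toy_pairs = [
    (1.80, 1.40, 2.30, 1.70, 1.30, 2.20),
    (0.95, 0.80, 1.12, 1.02, 0.85, 1.22),
    (2.60, 1.10, 6.10, 1.90, 0.90, 4.00),
    (0.60, 0.45, 0.80, 0.66, 0.50, 0.87),
]
summary = compare_pairs(toy_pairs)
print(summary)
```

Here `share_close` corresponds to the proportion of pairs with an OR difference ≤0.5, `share_discordant` to the proportion with differing significance, and `kappa` to chance-corrected agreement on significance.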
