Pseudonymization of Radiology Data for Research Purposes

Rita Noumeir, Jean-Marc Lina, Alain Lemay

doi:10.1007/s10278-006-1051-4

Abstract

Medical image processing methods and algorithms developed by researchers need to be validated and tested. Ideally, test data would be real clinical data, especially since clinical data is varied and exists in large volumes. Clinical data is now accessible electronically and has significant value for researchers. However, the use of clinical data for research purposes should respect data confidentiality, the patient's right to privacy, and patient consent. Clinical data is nominative: it contains information about the patient such as name, age, and identification number. It therefore needs to be de-identified before it is exported to research databases. At the same time, the same patient is usually followed over a long period of time, and the disease progression and diagnostic evolution are also extremely valuable information for researchers. Our objective is to build a research database from de-identified clinical data while allowing the data set to be easily incremented by exporting new pseudonymous data acquired over a long period of time. Pseudonymization is a form of data de-identification in which data belonging to an individual in the clinical environment still belong to the same individual in the de-identified research version. In this paper, we explore various software architectures for implementing an imaging research database that can be incremented over time. We also evaluate their security and discuss their security pitfalls. As most imaging data accessible electronically is available in the Digital Imaging and Communications in Medicine (DICOM) standard, we propose a de-identification scheme that closely follows DICOM recommendations. Our work can be used to enable secondary usage of the electronic health record (EHR), such as public surveillance and research, while maintaining patient confidentiality.
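
The property that distinguishes pseudonymization from plain anonymization, namely that records exported at different times for the same patient remain linked in the research database, can be illustrated with a minimal sketch. The sketch below is an assumption for illustration only and is not the scheme proposed in the paper: it derives the pseudonym with a keyed hash (HMAC-SHA256), represents a record as a plain dictionary keyed by DICOM attribute keywords, and uses hypothetical names (`pseudonymize_patient_id`, `deidentify`, `IDENTIFYING_FIELDS`).

```python
import hashlib
import hmac

# Hypothetical, deliberately short list of identifying attributes.
# A real DICOM de-identification profile covers many more attributes.
IDENTIFYING_FIELDS = {"PatientName", "PatientID", "PatientBirthDate"}


def pseudonymize_patient_id(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a patient ID with HMAC-SHA256.

    The same ID and key always give the same pseudonym, so later exports
    for the same patient stay linked. Assumption: the key is kept inside
    the clinical environment and never shared with researchers.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record without identifying fields,
    with PatientID replaced by its pseudonym."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    cleaned["PatientID"] = pseudonymize_patient_id(record["PatientID"], secret_key)
    return cleaned


if __name__ == "__main__":
    key = b"example-key-kept-in-the-clinical-environment"
    first_exam = {"PatientName": "DOE^JANE", "PatientID": "12345",
                  "PatientBirthDate": "19600101", "Modality": "MR"}
    follow_up = {"PatientName": "DOE^JANE", "PatientID": "12345",
                 "PatientBirthDate": "19600101", "Modality": "CT"}
    # Both exports carry the same pseudonym, so the follow-up exam can be
    # attached to the same research subject.
    print(deidentify(first_exam, key))
    print(deidentify(follow_up, key))
```

In this sketch the linkage depends entirely on the secrecy of the key: anyone who holds it can recompute the pseudonym of a known patient ID, which is one example of the kind of security pitfall an architecture has to address.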

Full Text