A DICOM dataset for evaluation of medical image de-identification

Michael Rutherford, Betty Levine, Kirk Smith, Ulrike Wagner, John Freyman, Quasar Jarosz, William Bennett, Lawrence Tarbox, Phil Farmer, Keyvan Farahani, Fred Prior, Geri Blake, Seong K Mun

doi:10.1038/s41597-021-00967-y

Abstract

We developed a DICOM dataset that can be used to evaluate the performance of de-identification algorithms. DICOM objects (a total of 1,693 CT, MRI, PET, and digital X-ray images) were selected from datasets published in The Cancer Imaging Archive (TCIA). Synthetic Protected Health Information (PHI) was generated and inserted into selected DICOM Attributes to mimic typical clinical imaging exams. The DICOM Standard and TCIA curation audit logs guided the insertion of synthetic PHI into standard and non-standard DICOM data elements. A TCIA curation team tested the utility of the evaluation dataset. With this publication, the evaluation dataset (containing synthetic PHI) and the de-identified evaluation dataset (the result of TCIA curation) are released on TCIA in advance of a competition, sponsored by the National Cancer Institute (NCI), for algorithmic de-identification of medical image datasets. The competition will use a much larger evaluation dataset constructed in the same manner. This paper describes the creation of the evaluation datasets and guidelines for their use.

Highlights

Open access or shared research data must comply with the Health Insurance Portability and Accountability Act (HIPAA) regulations that govern patient privacy
We developed a DICOM dataset that can be used to evaluate the performance of de-identification algorithms
This paper describes the creation of the evaluation datasets and guidelines for their use

Summary

Background & Summary

Open access or shared research data must comply with the Health Insurance Portability and Accountability Act (HIPAA) regulations that govern patient privacy. These regulations require the de-identification or removal of protected health information (PHI) and other personally identifiable information (PII) from datasets before they can be made publicly available. TCIA has developed image de-identification tools and protocols that combine automated and manual de-identification processes. Automated image de-identification algorithms require evaluation before they can be deployed to process data for open access. This evaluation requires a robust dataset against which image de-identification algorithms can be assessed. Researchers can use such a dataset to test their de-identification algorithms and to promote standardized procedures for validating automated de-identification.
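
The evaluation idea described above can be sketched in a few lines. The following is a minimal illustration, not the authors' tooling: it assumes each DICOM object is modeled as a plain Python dict mapping attribute keywords to values, and the helper names (`insert_synthetic_phi`, `score_deidentification`, `naive_deidentify`) and all sample values are hypothetical. It plants synthetic PHI into selected attributes, keeps the planted values as an answer key, and then checks whether a de-identifier's output still contains any planted value or has altered attributes that carried no PHI.

```python
# Hypothetical in-memory model: a DICOM object is a dict of
# attribute keyword -> value. No DICOM library is used here.

def insert_synthetic_phi(clean, synthetic_phi):
    """Return a copy of `clean` with synthetic PHI planted, plus the answer key."""
    planted = dict(clean)
    planted.update(synthetic_phi)
    return planted, dict(synthetic_phi)


def score_deidentification(output, answer_key, clean):
    """Report planted PHI that survived and non-PHI attributes that changed."""
    values = [str(v) for v in output.values()]
    # A planted value counts as leaked if it still appears anywhere in the output.
    leaked = sorted(
        key for key, phi in answer_key.items()
        if phi and any(phi in value for value in values)
    )
    # Attributes that never held synthetic PHI should come back unchanged.
    damaged = sorted(
        key for key, original in clean.items()
        if key not in answer_key and output.get(key) != original
    )
    return {"leaked": leaked, "damaged": damaged}


def naive_deidentify(record):
    """Toy de-identifier (an assumption): blanks only two well-known attributes."""
    cleaned = dict(record)
    for key in ("PatientName", "PatientID"):
        cleaned[key] = ""
    return cleaned


clean = {"Modality": "CT", "PatientName": "", "PatientID": "",
         "StudyDescription": "CHEST"}
synthetic_phi = {"PatientName": "DOE^JANE", "PatientID": "123456",
                 "StudyDescription": "CHEST for Jane Doe"}

planted, answer_key = insert_synthetic_phi(clean, synthetic_phi)
result = score_deidentification(naive_deidentify(planted), answer_key, clean)
print(result)
```

In this toy run the naive de-identifier clears the name and ID but misses the name embedded in the free-text study description, which is the kind of failure an evaluation dataset with known planted PHI is meant to expose.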

Journal: Scientific Data	Publication Date: Jul 16, 2021
Citations: 19	License type: open-access
