Abstract

Background: Electronic medical records (EMRs) represent a potentially rich source of health information for research, but the free-text in EMRs often contains identifying information. While de-identification tools have been developed for free-text, none have been developed or tested for the full range of primary care EMR data.

Methods: We used the open-source de-identification software deid and modified it for an Ontario context for use on primary care EMR data. We developed the modified program on a training set of 1000 free-text records from one group practice and then tested it on two validation sets: a random sample of 700 free-text EMR records from 17 different physicians from 7 different practices in 5 different cities, and 500 free-text records from a group practice in a different city from the group practice used for the training set. We measured the sensitivity/recall, precision, specificity, accuracy and F-measure of the modified tool against manually tagged free-text records in removing patient and physician names, locations, addresses, medical record, health card and telephone numbers.

Results: On the training set, the modified program performed with a sensitivity of 88.3%, specificity of 91.4%, precision of 91.3%, accuracy of 89.9% and F-measure of 0.90. The validation sets had sensitivities of 86.7% and 80.2%, specificities of 91.4% and 87.7%, precisions of 91.1% and 87.4%, accuracies of 89.0% and 83.8% and F-measures of 0.89 and 0.84 for the first and second validation sets respectively.

Conclusion: The deid program can be modified to de-identify free-text primary care EMR records with reasonable accuracy while preserving clinical content.
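The five measures named in the Methods can be written out from the standard confusion-matrix definitions. The sketch below is illustrative only: the `Counts` class and the function names are assumptions introduced here, not part of deid or of the authors' evaluation code, and the abstract does not state the unit of counting (word, phrase or record). The only data used are the precision and recall figures reported in the Results, which are fed into the usual harmonic-mean formula for the F-measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts against the manually tagged reference (hypothetical helper)."""
    tp: int  # identifying items correctly removed
    fp: int  # non-identifying items removed by mistake
    tn: int  # non-identifying items correctly kept
    fn: int  # identifying items that were missed


def sensitivity(c: Counts) -> float:
    """Recall: share of identifying items that were removed."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    """Share of non-identifying items that were left in place."""
    return c.tn / (c.tn + c.fp)


def precision(c: Counts) -> float:
    """Share of removed items that were truly identifying."""
    return c.tp / (c.tp + c.fp)


def accuracy(c: Counts) -> float:
    """Share of all items that were handled correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


# (precision, recall) pairs as reported in the Results.
reported = {
    "training": (0.913, 0.883),
    "validation 1": (0.911, 0.867),
    "validation 2": (0.874, 0.802),
}

for name, (p, r) in reported.items():
    print(f"{name}: F-measure = {f_measure(p, r):.2f}")
```

Run as written, this prints 0.90, 0.89 and 0.84, which agrees with the F-measures reported for the training set and the two validation sets.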

Highlights

  • Electronic medical records (EMRs) represent a potentially rich source of health information for research but the free-text in EMRs often contains identifying information

  • At the Institute for Clinical Evaluative Sciences (ICES) we have developed an Electronic Medical Record Administrative data Linked Database (EMRALD) using data from family physician EMRs

  • Even though ICES does not release any individual level information, a free-text de-identification tool is needed in order to further enhance privacy measures through all steps of in-house EMR data analysis

Summary

Introduction

Electronic medical records (EMRs) represent a potentially rich source of health information for research, but the free-text in EMRs often contains identifying information. The Institute for Clinical Evaluative Sciences (ICES) is an independent, not-for-profit health services research organization with a unique designation as a 'prescribed entity' under Section 45(1) of the Personal Health Information Protection Act (PHIPA), Ontario's privacy legislation[4]. This means that ICES has policies and procedures in place to protect the privacy and confidentiality of patients[5] as required by the Act (s.45(3)), which have been reviewed and approved by the Information and Privacy Commissioner of Ontario. Even though ICES does not release any individual level information, a free-text de-identification tool is needed in order to further enhance privacy measures through all steps of in-house EMR data analysis.

