Abstract

Background

Multi-site health sciences research is becoming more common, as it enables investigation of rare outcomes and diseases and of new healthcare innovations. Multi-site research usually involves the transfer of large amounts of research data between collaborators, which increases the potential for accidental disclosure of protected health information (PHI). Standard protocols for preventing release of PHI are extremely vulnerable to human error, particularly when the shared data sets are large.

Methods

To address this problem, we developed an automated program (SAS macro) to identify possible PHI in research data before it is transferred between research sites. The macro reviews all data in a designated directory to identify suspicious variable names and data patterns. It looks for variables that may contain personal identifiers such as medical record numbers and social security numbers. In addition, the macro identifies dates and numbers that may identify people who belong to small groups and who may be identifiable even in the absence of traditional identifiers.

Results

Evaluation of the macro on 100 sample research data sets indicated a recall of 0.98 and a precision of 0.81.

Conclusions

When implemented consistently, the macro has the potential to streamline the PHI review process and significantly reduce accidental PHI disclosures.
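
The kind of screening described in the Methods can be illustrated with a short sketch. This is not the authors' SAS macro: it is written in Python for illustration only, and the name fragments, regular expressions, function names, and the small-group threshold of 5 are all assumptions, since the abstract does not list the macro's actual rules.

```python
import re

# Hypothetical name fragments that suggest a direct identifier; the
# macro's actual list is not given in the abstract.
SUSPICIOUS_NAME_PARTS = ("mrn", "ssn", "name", "birth", "dob", "address", "phone")

# Illustrative value patterns (assumptions): a 9-digit US social security
# number with optional hyphens, and an ISO-style date (YYYY-MM-DD).
SSN_PATTERN = re.compile(r"^\d{3}-?\d{2}-?\d{4}$")
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def flag_variable_names(columns):
    """Return column names that contain a suspicious fragment (case-insensitive)."""
    return [c for c in columns if any(p in c.lower() for p in SUSPICIOUS_NAME_PARTS)]


def flag_value_patterns(rows):
    """Return {column: reason} for columns with values that look like SSNs or dates."""
    flags = {}
    for row in rows:
        for column, value in row.items():
            text = str(value).strip()
            if SSN_PATTERN.match(text):
                flags.setdefault(column, "possible SSN")
            elif DATE_PATTERN.match(text):
                flags.setdefault(column, "possible date")
    return flags


def flag_small_groups(rows, column, min_count=5):
    """Return values of `column` shared by fewer than `min_count` rows.

    The default threshold is an assumption, not a value from the abstract.
    """
    counts = {}
    for row in rows:
        counts[row[column]] = counts.get(row[column], 0) + 1
    return sorted(v for v, n in counts.items() if n < min_count)
```

Each function returns candidates for human review rather than a verdict. Erring toward over-flagging is consistent with the reported results, where recall (0.98) is higher than precision (0.81).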
