P2P Watch: Personal Health Information Detection in Peer-to-Peer File-Sharing Networks

Marina Sokolova, Luk Arbuckle, Elizabeth Jonker, Emilio Neri, Khaled El Emam, Sean Rose

doi:10.2196/jmir.1898

Abstract

Background: Users of peer-to-peer (P2P) file-sharing networks risk the inadvertent disclosure of personal health information (PHI). In addition to potentially causing harm to the affected individuals, this can heighten the risk of data breaches for health information custodians. Automated PHI detection tools that crawl the P2P networks can identify PHI and alert custodians. While there has been previous work on the detection of personal information in electronic health records, there has been a dearth of research on the automated detection of PHI in heterogeneous user files.

Objective: To build a system that accurately detects PHI in files sent through P2P file-sharing networks. The system, which we call P2P Watch, uses a pipeline of text processing techniques to automatically detect PHI in files exchanged through P2P networks. P2P Watch processes unstructured texts regardless of the file format, document type, and content.

Methods: We developed P2P Watch to extract and analyze PHI in text files exchanged on P2P networks. We labeled texts as PHI if they contained identifiable information about a person (eg, name and date of birth) and specifics of the person’s health (eg, diagnosis, prescriptions, and medical procedures). We evaluated the system’s performance through its efficiency and effectiveness on 3924 files gathered from three P2P networks.

Results: P2P Watch successfully processed 3924 P2P files of unknown content. A manual examination of 1578 randomly selected files marked by the system as non-PHI confirmed that these files indeed did not contain PHI, making the false-negative detection rate equal to zero. Of 57 files marked by the system as PHI, all contained both personally identifiable information and health information: 11 files were PHI disclosures, and 46 files contained organizational materials such as unfilled insurance forms, job applications by medical professionals, and essays.

Conclusions: PHI can be successfully detected in free-form textual files exchanged through P2P networks. Once the files with PHI are detected, affected individuals or data custodians can be alerted to take remedial action.
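
The labeling rule stated in the abstract (a text counts as PHI only if it contains both identifiable information about a person and specifics of that person's health) can be read as a conjunction of two checks. The Python sketch below illustrates that reading only. It is not the authors' implementation: the abstract does not describe the detectors in the P2P Watch pipeline, so the regular expressions, the term list, and the function names here are all hypothetical stand-ins.

```python
import re

# Hypothetical detectors, chosen only for illustration. The abstract does not
# say how P2P Watch recognizes names, dates of birth, or health terms.
NAME_RE = re.compile(r"\b(?:Mr|Mrs|Ms)\.?\s+[A-Z][a-z]+(?:\s+[A-Z][a-z]+)?")
DOB_RE = re.compile(
    r"\b(?:date of birth|DOB)\b\s*:?\s*\d{1,2}[/-]\d{1,2}[/-]\d{2,4}",
    re.IGNORECASE,
)
HEALTH_TERMS = {"diagnosis", "diagnosed", "prescription", "prescribed", "surgery"}


def has_personal_identifiers(text: str) -> bool:
    """True if the text contains a titled name or a date of birth (assumed cues)."""
    return bool(NAME_RE.search(text) or DOB_RE.search(text))


def has_health_details(text: str) -> bool:
    """True if any word of the text is in the assumed health vocabulary."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return not words.isdisjoint(HEALTH_TERMS)


def label_text(text: str) -> str:
    """Label a text as PHI only when both kinds of information co-occur."""
    if has_personal_identifiers(text) and has_health_details(text):
        return "PHI"
    return "non-PHI"


if __name__ == "__main__":
    samples = [
        "Mrs. Jane Smith, DOB: 03/14/1962, diagnosis: type 2 diabetes",
        "Diabetes diagnosis and treatment: an essay",
        "Mr. John Doe, 12 Elm Street",
    ]
    for sample in samples:
        print(label_text(sample), "|", sample)
```

A rule of this shape flags any text in which the two kinds of information co-occur, whether or not the text is an actual disclosure. That is consistent with the reported results, where 46 of the 57 flagged files were organizational materials rather than PHI disclosures.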

Highlights

Evidence shows that files sent through peer-to-peer (P2P) file-sharing networks can disclose an individual’s personal health information (PHI) to millions of network users
We applied P2P Watch for PHI detection in 3924 files exchanged on three P2P networks
We have introduced P2P Watch, which detects PHI in files shared by users of P2P networks

Summary

Introduction

Evidence shows that files sent through peer-to-peer (P2P) file-sharing networks can disclose an individual’s personal health information (PHI) to millions of network users, and users of these networks risk disclosing PHI inadvertently. While there has been previous work on the detection of personal information in electronic health records, there has been a dearth of research on the automated detection of PHI in heterogeneous user files. The system, which we call P2P Watch, uses a pipeline of text processing techniques to automatically detect PHI in files exchanged through P2P networks. We labeled texts as PHI if they contained identifiable information about a person (eg, name and date of birth) and specifics of the person’s health (eg, diagnosis, prescriptions, and medical procedures). Once the files with PHI are detected, affected individuals or data custodians can be alerted to take remedial action.



Journal: Journal of Medical Internet Research	Publication Date: Jul 9, 2012
Citations: 8	License type: cc-by
