Attribute Disclosure Attacks Research Articles

Simulating electronic health record data offers an opportunity to resolve the tension between data sharing and patient privacy. Recent techniques based on generative adversarial networks have shown promise but neglect the temporal aspect of healthcare. We introduce a generative framework for simulating the trajectory of patients' diagnoses, along with measures to evaluate utility and privacy. The framework simulates date-stamped diagnosis sequences based on a 2-stage process that 1) sequentially extracts temporal patterns from clinical visits and 2) generates synthetic data conditioned on the learned patterns. We designed 3 utility measures to characterize the extent to which the framework maintains feature correlations and temporal patterns in clinical events. We evaluated the framework with billing codes, represented as phenome-wide association study codes (phecodes), from over 500,000 Vanderbilt University Medical Center electronic health records. We further assessed the privacy risks based on membership inference and attribute disclosure attacks. The simulated temporal sequences exhibited similar characteristics to real sequences on the utility measures. Notably, diagnosis prediction models based on real versus synthetic temporal data exhibited an average relative difference in area under the ROC curve of 1.6%, with a standard deviation of 3.8%, for 1276 phecodes. Additionally, the relative differences in the mean occurrence age and time between visits were 4.9% and 4.2%, respectively. The privacy risks in synthetic data, with respect to membership and attribute inference, were negligible. This investigation indicates that temporal diagnosis code sequences can be simulated in a manner that provides utility and respects privacy.
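To make the idea of an attribute disclosure attack on synthetic data concrete, here is a minimal sketch in Python. It is not the evaluation procedure from the abstract above, whose details are not given here; it assumes one common formulation in which an attacker knows some attributes of a real record, finds the nearest synthetic records on those attributes, and guesses the unknown attribute by majority vote. The function names, the toy binary records, and the choice of Hamming distance are all illustrative assumptions.

```python
from collections import Counter


def hamming(a, b, idx):
    """Count the positions in idx where records a and b disagree."""
    return sum(a[i] != b[i] for i in idx)


def infer_attribute(target, synthetic, known_idx, unknown_idx, k=3):
    """Guess target[unknown_idx] from the k nearest synthetic records.

    Assumption: the attacker sees only target[i] for i in known_idx
    and the full synthetic dataset.
    """
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(synthetic, key=lambda row: hamming(target, row, known_idx))
    votes = Counter(row[unknown_idx] for row in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each tuple is a binary diagnosis-code vector.
synthetic = [
    (1, 0, 1, 1),
    (1, 0, 1, 1),
    (0, 1, 0, 0),
    (1, 0, 0, 1),
    (0, 1, 1, 0),
]
target = (1, 0, 1, 1)  # real record; the attacker knows positions 0-2 only

guess = infer_attribute(target, synthetic, known_idx=[0, 1, 2], unknown_idx=3)
attack_succeeded = guess == target[3]
```

In an evaluation of this kind, the attack would be repeated over many real records, and the success rate would be compared with a baseline that does not use the synthetic data; a small gap suggests low attribute disclosure risk.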

Data mining is the process of analyzing data. Data privacy concerns the collection and dissemination of data. Privacy issues arise in areas such as health care, intellectual property, biological data, and financial transactions. Data are especially difficult to protect while they are being transferred, yet sensitive information must be protected. There are two major kinds of attack against privacy: record linkage attacks and attribute linkage attacks. Researchers have proposed several methods for data privacy, namely k-anonymity, l-diversity, and t-closeness. The k-anonymity method preserves privacy against record linkage attacks alone; it is unable to address attribute linkage attacks. The l-diversity method overcomes this drawback of k-anonymity, but it fails to prevent identity disclosure and attribute disclosure attacks. The t-closeness method preserves privacy against attribute linkage attacks but not against identity disclosure attacks. The proposed method preserves the privacy of individuals' sensitive data against both record linkage and attribute linkage attacks. It achieves privacy preservation through generalization, by setting range values, and through record elimination.
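The notions in this abstract can be illustrated with a short Python sketch. It is not the authors' method, which the abstract does not specify in detail; it only shows, under assumed field names (`age`, `zip`, `diagnosis`) and an assumed range width of 10 years, what generalization to range values, record elimination, and checks for k-anonymity and l-diversity might look like on a toy table.

```python
from collections import defaultdict


def generalize_age(age, width=10):
    """Replace an exact age with a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def group_by_quasi_identifiers(records, quasi_ids):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[q] for q in quasi_ids)].append(rec)
    return groups


def is_k_anonymous(records, quasi_ids, k):
    """True if every quasi-identifier group holds at least k records."""
    groups = group_by_quasi_identifiers(records, quasi_ids)
    return all(len(g) >= k for g in groups.values())


def is_l_diverse(records, quasi_ids, sensitive, l):
    """True if every group holds at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(records, quasi_ids)
    return all(len({r[sensitive] for r in g}) >= l for g in groups.values())


def eliminate_small_groups(records, quasi_ids, k):
    """Record elimination: drop records whose group has fewer than k members."""
    groups = group_by_quasi_identifiers(records, quasi_ids)
    return [r for g in groups.values() if len(g) >= k for r in g]


# Hypothetical toy table; all values are made up for illustration.
records = [
    {"age": 31, "zip": "37201", "diagnosis": "flu"},
    {"age": 35, "zip": "37201", "diagnosis": "asthma"},
    {"age": 38, "zip": "37201", "diagnosis": "flu"},
    {"age": 52, "zip": "37203", "diagnosis": "diabetes"},
]
quasi_ids = ["age", "zip"]

generalized = [{**r, "age": generalize_age(r["age"])} for r in records]
released = eliminate_small_groups(generalized, quasi_ids, k=2)
```

Here the generalized table is not 2-anonymous, because the single record in the `50-59` range forms a group of one. Eliminating that record leaves three records in one group, which satisfies 2-anonymity and, with two distinct diagnoses, 2-diversity. A group whose records all shared one diagnosis would still be k-anonymous but would disclose the sensitive attribute, which is the attribute linkage weakness the abstract describes.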

Articles published on Attribute Disclosure Attacks

Flexible adversary disclosure risk measure for identity and attribute disclosure attacks

SynTEG: a framework for temporal structured electronic health data simulation.

Privacy Preserving Anonymization Schemes-On Transaction Data Publishing

Anonymizied Approach to Preserve Privacy of Published Data Through Record Elimination

A New Method for Preserving Privacy in Data Publishing Against Attribute and Identity Disclosure Risk
