Improved angelization technique against background knowledge attack for 1:M microdata.

Razaullah Khan, Abid Khan, Adeel Anjum, Semeen Rehman, Madiha Haider Syed, Rabeeha Fazal

doi:10.7717/peerj-cs.1255

Abstract

With the advent of modern information systems, sharing Electronic Health Records (EHRs) across organizations for better medical treatment and analysis benefits both academic and business development. However, individual privacy is a major concern because of trust issues across organizations. At the same time, the shared data must retain enough utility to be useful. Most existing work assumes that an individual has only one record in a dataset (1:1 dataset), which is not the case in many applications. More realistically, an individual may have more than one record in a dataset (1:M dataset). In this article, we highlight the high utility loss and inapplicability of the θ-Sensitive k-Anonymity privacy model for the 1:M dataset, as well as the high utility loss and low data privacy of (p,l)-angelization and (k,l)-diversity for the 1:M dataset. As a mitigation, we propose an improved (θ∗,k)-utility algorithm that enhances both the privacy and the utility of the anonymized 1:M dataset. Experiments on a real-world dataset show that the proposed approach outperforms its counterparts in terms of utility and privacy for the 1:M dataset.

Full Text