Abstract

Many data sources, such as IoT devices, now produce massive amounts of data, particularly in the healthcare industry. This microdata needs to be published and shared for medical research, data analysis, mining, learning analytics, and decision-making. However, the published data contains sensitive and private information about individuals. If it is released in its original form, that information may be disclosed, putting individuals at risk, especially when an adversary has strong background knowledge about the target individual. When an individual has multiple records (a 1:M dataset) and multiple sensitive attributes (MSA), new privacy leakages or disclosures become possible. The fundamental issue is therefore how to protect privacy in 1:M datasets with MSA using anonymization techniques, and how to balance utility and privacy while reducing information loss and misuse. The objective of this paper is to apply different anonymization methods and algorithms, such as the 1:M-generalization algorithm and Mondrian, and to compare them to show which maintains data privacy and high utility of analysis results at the same time. From this comparison, we found that the m-generalization algorithm and the (p, k) angelization method perform well in terms of information loss and data utility compared to the other methods and algorithms.
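
To illustrate why a dataset with multiple records per individual needs special handling, the following minimal sketch uses a hypothetical toy table in which each row is one hospital visit. It is not the paper's method: the column layout, the values, and the helper names `group_by_qi`, `record_level_k`, and `individual_level_k` are all assumptions made for illustration. It shows that a group of records sharing the same generalized quasi-identifiers can contain k records while covering fewer than k distinct individuals.

```python
from collections import defaultdict

# Hypothetical toy 1:M table: one row per visit, so a patient can appear many times.
# Columns: (patient_id, generalized age range, generalized ZIP code, diagnosis)
RECORDS = [
    ("p1", "30-39", "481**", "flu"),
    ("p1", "30-39", "481**", "diabetes"),
    ("p1", "30-39", "481**", "asthma"),
    ("p2", "30-39", "481**", "flu"),
    ("p3", "40-49", "482**", "hepatitis"),
    ("p4", "40-49", "482**", "flu"),
    ("p5", "40-49", "482**", "asthma"),
]


def group_by_qi(records):
    """Group patient ids by their (age range, ZIP) quasi-identifier pair."""
    groups = defaultdict(list)
    for patient_id, age_range, zip_code, _diagnosis in records:
        groups[(age_range, zip_code)].append(patient_id)
    return groups


def record_level_k(records):
    """Smallest group size when every row is counted separately."""
    return min(len(ids) for ids in group_by_qi(records).values())


def individual_level_k(records):
    """Smallest group size when each individual is counted only once."""
    return min(len(set(ids)) for ids in group_by_qi(records).values())


if __name__ == "__main__":
    print("record-level k:", record_level_k(RECORDS))
    print("individual-level k:", individual_level_k(RECORDS))
```

On this toy table, `record_level_k(RECORDS)` is 3 while `individual_level_k(RECORDS)` is 2, because the first group holds four records that belong to only two patients. Counting rows alone therefore overstates the protection each individual receives.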

