Abstract

Big data has become omnipresent and crucial in many application domains. The term refers to the explosive quantity of data generated in today’s society, much of which may contain personally identifiable information (PII). Data privacy is therefore one of the major hurdles to the application of big data. Several techniques have been proposed to ensure privacy in big data, including generalization, randomization and cryptographic techniques. The literature distinguishes two main types of attributes: quasi identifier attributes and sensitive attributes. In this paper, we focus on quasi identifier attributes. Over the years, k-anonymity has attracted great interest as an anonymization technique that ensures privacy in big data when dealing with quasi identifier attributes. Although many k-anonymity algorithms have been proposed, most of them assume that the threshold k must be known before the data set is anonymized. Here, we present a novel way of applying k-anonymity to quasi identifier attributes: a new algorithm called “k-anonymity without prior value of the threshold k”. The proposed algorithm was experimentally evaluated using a test table of quasi identifier attributes. Furthermore, we highlight all the steps of the proposed algorithm with detailed comments.
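To make the role of the threshold k concrete, the following minimal Python sketch computes the k that a table already satisfies, that is, the size of its smallest group of rows sharing the same quasi identifier values. It illustrates only the standard definition of k-anonymity, not the algorithm proposed in the paper. The function name `achieved_k`, the column names and the sample rows are hypothetical.

```python
from collections import Counter


def achieved_k(rows, quasi_identifiers):
    """Return the size of the smallest equivalence class in `rows`.

    An equivalence class is the set of rows that share the same values
    on every quasi identifier attribute. A table is k-anonymous when
    every class contains at least k rows, so the returned value is the
    largest k for which the table is k-anonymous.
    """
    if not rows:
        return 0
    classes = Counter(
        tuple(row[attr] for attr in quasi_identifiers) for row in rows
    )
    return min(classes.values())


# Hypothetical, already generalized table (illustrative values only).
table = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "asthma"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "diabetes"},
    {"age": "30-39", "zip": "148**", "disease": "asthma"},
]

k = achieved_k(table, ["age", "zip"])
```

Here the sensitive attribute `disease` is ignored, since only the quasi identifier columns determine the equivalence classes.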
