Abstract

With the advent of the big data era, data privacy protection has become an important topic in the field of data publication. Unfortunately, traditional privacy-protection methods such as k-anonymity and its extensions are not absolutely secure, because an adversary with background knowledge can still determine the owner of a record. Differential privacy provides a reasonable alternative, but existing solutions ignore the correlation between sensitive attributes and other attributes. In this paper, we propose DP-QIC, a new differential privacy scheme based on quasi-identifier classification for big data publication. It is a data publishing scheme based on the obfuscation of attribute correlation. We introduce a quasi-identifier classification based on sensitive attributes, together with a privacy ratio for evaluating the vulnerability of a data set. DP-QIC protects data privacy through four steps: data collection, grouping and shuffling, generalization, merging, and noise adding, which retain the overall statistical characteristics of the data set. Moreover, the exponential mechanism and the Laplace mechanism are integrated to provide higher flexibility and a stronger level of privacy protection, so DP-QIC can be applied to the privacy processing of different data groups in future development. Finally, we compare the performance of our scheme with that of two other well-known schemes. Experimental results demonstrate that DP-QIC has clear advantages in data utility, privacy protection, and processing efficiency.
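
For readers unfamiliar with the two mechanisms named above, the following is a minimal sketch of their standard textbook forms. It is not the DP-QIC implementation: how the scheme combines them is not described in this abstract. The function names, the `rng` parameter, and the example values are all assumptions made for illustration.

```python
import math
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value plus Laplace noise with scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_value + noise


def exponential_mechanism(candidates, utility, sensitivity, epsilon, rng):
    """Pick a candidate with probability proportional to
    exp(epsilon * utility(c) / (2 * sensitivity))."""
    scores = [utility(c) for c in candidates]
    best = max(scores)
    # Subtracting the best score keeps exp() from overflowing; it does not
    # change the resulting probabilities.
    weights = [math.exp(epsilon * (s - best) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable

    # Hypothetical numeric query: a count with sensitivity 1.
    noisy_count = laplace_mechanism(120, sensitivity=1.0, epsilon=0.5, rng=rng)
    print("noisy count:", noisy_count)

    # Hypothetical categorical choice: pick a generalization level by utility.
    utilities = {"city": 3.0, "region": 2.0, "country": 1.0}
    level = exponential_mechanism(
        list(utilities), utilities.get, sensitivity=1.0, epsilon=0.5, rng=rng
    )
    print("chosen level:", level)
```

The Laplace mechanism suits numeric outputs such as counts, while the exponential mechanism suits selection among discrete options, which is one plausible reason a scheme would integrate both. A smaller `epsilon` gives stronger privacy at the cost of noisier output.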
