Abstract

Privacy-preserving data publishing (PPDP) is an essential prerequisite for data-driven AI technologies (such as data mining, machine learning, and deep learning) to extract knowledge from data safely and legally. It has been widely studied over the last decade. However, existing privacy protection mechanisms cannot account for all three of the following aspects at once: preventing background knowledge attacks, maximizing data availability, and resisting sensitive information mining. In this work, we propose a novel privacy-preserving data publishing framework, which protects privacy by releasing simulated data instead of real data. The framework uses a Bayesian network to generate data whose distribution is similar to that of the real data. It consists of two components. First, we transform the problem of data publication into the generation process of a Bayesian network; correspondingly, the problem of privacy leakage is transformed into a kind of Bayesian inference attack. Second, we propose a re-anonymity framework, named (d, L)-injection, which flexibly addresses the impact of increased privacy protection strength on data availability. In addition, we transplant three classical privacy-preserving strategies to the generated Bayesian network, and demonstrate the effectiveness of the method on three public data sets from multiple application domains.
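
To make the idea of releasing simulated data concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical two-attribute table and a fixed network structure (age group → disease), estimates the probabilities from counts, and draws synthetic records by ancestral sampling. The names `REAL`, `fit`, and `sample` are illustrative assumptions, and the sketch does not include the (d, L)-injection step or provide any privacy guarantee on its own.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy records (age_group, disease); not data from the paper.
REAL = [
    ("young", "flu"), ("young", "flu"), ("young", "asthma"),
    ("old", "diabetes"), ("old", "diabetes"), ("old", "flu"),
]

def fit(records):
    """Count-based estimates for P(age) and P(disease | age).

    Assumes a fixed structure age -> disease; structure learning is omitted.
    """
    prior = Counter(age for age, _ in records)
    cond = defaultdict(Counter)
    for age, disease in records:
        cond[age][disease] += 1
    return prior, cond

def sample(prior, cond, n, rng):
    """Ancestral sampling: draw the parent first, then the child given the parent."""
    ages = list(prior)
    out = []
    for _ in range(n):
        age = rng.choices(ages, weights=[prior[a] for a in ages])[0]
        diseases = list(cond[age])
        disease = rng.choices(diseases, weights=[cond[age][d] for d in diseases])[0]
        out.append((age, disease))
    return out

prior, cond = fit(REAL)
# Synthetic records that would be released in place of REAL.
synthetic = sample(prior, cond, 1000, random.Random(0))
```

Because the conditional tables are raw counts, the sampler can only emit attribute combinations that already occur in the real table. This is one reason a further protection step, such as the re-anonymity framework described above, would still be needed before release.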
