Abstract

Releasing representative data sets without compromising data privacy has attracted increasing attention from the database community in recent years. Differential privacy is an influential privacy framework for data mining and data release without revealing sensitive information. However, existing differentially private solutions cannot effectively handle the release of high-dimensional data, because both the perturbation error and the computational complexity grow with the dimensionality. To address this deficiency, we propose DPPro, a differentially private algorithm for high-dimensional data release via random projection that maximizes utility while guaranteeing privacy. We theoretically prove that DPPro can generate a synthetic data set that approximately preserves the squared Euclidean distances between high-dimensional vectors while achieving $(\epsilon,\delta)$-differential privacy. Based on the theoretical analysis, we observe that the utility guarantees of the released data depend on the projection dimension and the variance of the noise. Extensive experimental results demonstrate that DPPro substantially outperforms several state-of-the-art solutions in terms of perturbation error and privacy budget on high-dimensional data sets.
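The general idea of combining random projection with additive noise can be illustrated with a minimal sketch. This is not the DPPro algorithm as specified in the paper: the function names, the choice of a Gaussian projection matrix with $N(0, 1/k)$ entries, and the treatment of `sigma` as a free parameter are all assumptions made for illustration. In particular, the abstract does not state how the noise scale should be calibrated to $(\epsilon,\delta)$, so the sketch makes no privacy claim for any given `sigma`. It only shows why utility depends on the projection dimension `k` and the noise variance: for independent Gaussian noise, the expected squared distance between two released vectors equals the original squared distance plus $2k\sigma^2$.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (assumed choice)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project_and_perturb(rows, k, sigma, seed=0):
    """Project each d-dimensional row to k dimensions, then add N(0, sigma^2) noise.

    sigma is a free parameter here; calibrating it to a target
    (epsilon, delta) is outside the scope of this sketch.
    """
    rng = random.Random(seed)
    d = len(rows[0])
    P = random_projection_matrix(d, k, rng)
    released = []
    for x in rows:
        z = [
            sum(p_ij * x_j for p_ij, x_j in zip(p_i, x)) + rng.gauss(0.0, sigma)
            for p_i in P
        ]
        released.append(z)
    return released


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def debiased_squared_distance(u, v, sigma):
    """Subtract the expected noise contribution 2 * k * sigma^2."""
    return squared_distance(u, v) - 2 * len(u) * sigma ** 2
```

Calling `project_and_perturb(rows, k, sigma)` on a list of equal-length numeric rows returns one `k`-dimensional noisy vector per row. A larger `k` tightens the distance estimate from the projection but also accumulates more noise per vector, which is the trade-off the abstract attributes to the projection dimension and the noise variance.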
