Abstract

In this paper, we study the privacy-preserving data publishing problem in a distributed environment. The data contain sensitive information; hence, directly pooling and publishing the local data will lead to privacy leaks. To solve this problem, we propose a multiparty horizontally partitioned data publishing method under differential privacy (HPDP-DP). First, in order to make the noise level of the published data in the distributed scenario the same as in the centralized scenario, we use the infinite divisibility of the Laplace distribution to design a distributed noise addition scheme to perturb the locally shared data and use Paillier encryption to transmit the locally shared data to the semitrusted curator. Then, the semitrusted curator obtains the estimator of the covariance matrix of the aggregated data with Laplace noise and then obtains the principal components of the aggregated data and returns them to each data owner. Finally, the data owner utilizes the generative model of probabilistic principal component analysis to generate a synthetic data set for publication. We conducted experiments on different real data sets; the experimental results demonstrate that the synthetic data set released by the HPDP-DP method can maintain high utility.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call