Abstract

Probability density estimation is a fundamental task in data analysis: it estimates the unobservable underlying probability density function from observed data. However, the data used for density estimation may contain sensitive information, and publishing the original data would compromise individuals' privacy. To address this problem, we propose PrivGMM, a private parametric probability density estimation mechanism. For users, it provides strong privacy guarantees locally (e.g., on personal computers or mobile phones) and efficiently (i.e., at a small computation cost). For data collectors, it provides an accurate estimate of the parameters of the probability density model. Specifically, in the local setting, each user adds noise to his or her original data under the constraint of local differential privacy. On the server side, we employ the Gaussian Mixture Model, a popular model for approximating distributions. To reduce the effect of the noise, we formulate the parameter estimation problem with a multi-layer latent variable structure and use the Expectation-Maximization algorithm to fit the Gaussian Mixture Model. Experiments on real datasets validate that our mechanism outperforms the state-of-the-art methods.
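
The local perturbation step can be illustrated with a minimal sketch. It is not the PrivGMM mechanism itself: the abstract does not say which noise distribution is used, so the Laplace mechanism on values clipped to a known range [lo, hi] is an assumption made here for illustration, and a simple moment correction stands in for the EM-based estimator on the server side. All function names are hypothetical.

```python
import random
import statistics


def perturb(x, epsilon, lo=0.0, hi=1.0, rng=random):
    """User side (assumed mechanism): clip x to [lo, hi], then add Laplace noise.

    With scale (hi - lo) / epsilon, the released value satisfies
    epsilon-local differential privacy for inputs in [lo, hi].
    """
    x = min(max(x, lo), hi)
    scale = (hi - lo) / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return x + noise


def laplace_noise_variance(epsilon, lo=0.0, hi=1.0):
    """Variance of the added noise: 2 * scale**2 for a Laplace distribution."""
    scale = (hi - lo) / epsilon
    return 2.0 * scale ** 2


def debiased_moments(reports, epsilon, lo=0.0, hi=1.0):
    """Server side (illustrative stand-in, not the paper's EM procedure).

    The noise has zero mean, so the sample mean is already unbiased.
    The noise is independent of the data, so its variance is subtracted
    from the variance of the reports.
    """
    mean = statistics.fmean(reports)
    var = statistics.pvariance(reports, mu=mean)
    var -= laplace_noise_variance(epsilon, lo, hi)
    return mean, max(var, 0.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical private values drawn from a two-component mixture in [0, 1].
    private = [
        min(max(rng.gauss(0.3 if rng.random() < 0.5 else 0.7, 0.05), 0.0), 1.0)
        for _ in range(50_000)
    ]
    reports = [perturb(x, epsilon=1.0, rng=rng) for x in private]
    print(debiased_moments(reports, epsilon=1.0))
```

The sketch shows why the server cannot fit a mixture model to the reports directly: at small epsilon the noise variance dominates the variance of the data, so the estimator has to model the noise explicitly, which is the role the abstract assigns to the multi-layer latent variable formulation.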
