Abstract
In this paper, we give a modified gradient EM algorithm that protects the privacy of sensitive data by adding noise from the discrete Gaussian mechanism. Specifically, it makes high-dimensional data easier to process, mainly through scaling, truncation, noise multiplication, and smoothing steps applied to the data. Since the variance of the discrete Gaussian is smaller than that of the continuous Gaussian, the differential privacy of the data can be guaranteed more effectively by adding discrete Gaussian noise. Finally, the standard gradient EM algorithm, the clipped algorithm, and our algorithm (DG-EM) are compared on the Gaussian mixture model (GMM). The experiments show that our algorithm can effectively protect high-dimensional sensitive data.
Highlights
Big data have spread to every field and organization in our society, generating large amounts of personal data every day, which people use and analyse to enable the rapid development of society and technology
We introduce our model, namely, the differentially private discrete Gaussian (gradient) EM algorithm (DG-EM), and the relevant statistical guarantee theorem
We will refer to the gradient EM algorithm as EM, which will serve as our nonprivate baseline method. The other is the clipped differentially private EM algorithm [20], which we refer to as clipped and which will serve as our private baseline approach
Summary
Big data have spread to every field and organization in our society, generating large amounts of personal data every day, which people use and analyse to enable the rapid development of society and technology. Balakrishnan et al. [4] gave the statistical guarantee of the EM algorithm; building on it, Wang et al. [3] gave the guarantee of the gradient EM algorithm and extended it to the theory of data privacy protection. Like most work in this area, this approach adds Gaussian noise with a continuous distribution to the data, whereas in practice the outputs of data queries are often discrete, such as the number of records in a database that meet certain conditions. For this reason, Canonne et al. [5] proposed a discrete Gaussian mechanism that adds discrete Gaussian noise to the data while ensuring the same excellent accuracy as adding continuous Gaussian noise.
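To make the idea of discrete Gaussian noise concrete, the following is a minimal sketch, not the exact sampler of Canonne et al. [5] and not the DG-EM algorithm itself. It assumes the discrete Gaussian on the integers has probability mass proportional to exp(-x^2 / (2 sigma^2)) and approximates it by truncating the support far into the tails. The function names, the `tail` parameter, and the example count of 120 are hypothetical and chosen only for illustration.

```python
import numpy as np


def discrete_gaussian_pmf(sigma, tail=12):
    """Approximate pmf of a discrete Gaussian on the integers.

    Mass at integer x is proportional to exp(-x^2 / (2 sigma^2)).
    Assumption: the support is truncated at roughly +/- tail * sigma,
    so this is an approximation rather than an exact distribution.
    """
    bound = int(np.ceil(tail * sigma)) + 1
    support = np.arange(-bound, bound + 1)
    weights = np.exp(-support.astype(float) ** 2 / (2.0 * sigma ** 2))
    return support, weights / weights.sum()


def sample_discrete_gaussian(sigma, size, rng):
    """Draw integer-valued noise from the truncated pmf above."""
    support, probs = discrete_gaussian_pmf(sigma)
    return rng.choice(support, size=size, p=probs)


def noisy_count(true_count, sigma, rng):
    """Release an integer count with integer noise added (illustrative)."""
    return int(true_count + sample_discrete_gaussian(sigma, 1, rng)[0])


rng = np.random.default_rng(0)
sigma = 0.5  # a small scale makes the variance gap visible numerically

support, probs = discrete_gaussian_pmf(sigma)
# The distribution is symmetric around 0, so the variance is E[x^2].
variance = float(np.sum(probs * support ** 2))
released = noisy_count(120, sigma, rng)

print("variance of discrete Gaussian:", variance)
print("variance of continuous Gaussian:", sigma ** 2)
print("noisy count:", released)
```

Because the noise is integer valued, a count query keeps an integer output after perturbation, which is the practical motivation given above. For larger values of sigma the two variances become numerically indistinguishable, so the small scale here is only a device for showing the gap.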