Range queries are a central topic in privacy-preserving data publishing. When noise is injected into the input data to preserve privacy, a larger range query accumulates more of that noise. This study investigates differential privacy for range queries via the Haar wavelet transform and the Gaussian mechanism. First, we analyze the noise injected into the input data by the Laplace mechanism and conclude that it is difficult to judge the level of privacy protection for range queries under the Haar wavelet transform combined with the Laplace mechanism, because the sum of independent Laplace random variables does not follow a Laplace distribution. Second, we propose injecting noise into the Haar wavelet coefficients via the Gaussian mechanism. Finally, we give the maximum variance for any range query under the framework of the Haar wavelet transform and the Gaussian mechanism. The analysis shows that, with the Haar wavelet transform and the Gaussian mechanism, differential privacy is preserved for each input datum and for any range query, and the noise variance is far smaller than when the Gaussian mechanism is used alone. In an experimental study on the age dataset extracted from IPUMS census data of the United States, we confirm that the proposed mechanism yields a much smaller maximum noise variance than the Gaussian mechanism for range-count queries.
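The pipeline summarized above (transform the data, perturb the Haar wavelet coefficients with Gaussian noise, reconstruct, then answer range-count queries) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the unnormalized averaging/differencing form of the Haar transform, and the single noise scale `sigma` applied to every coefficient are all assumptions, and `sigma` is not calibrated to any privacy budget here.

```python
import random


def haar_forward(values):
    """Unnormalized Haar transform of a list whose length is a power of two.

    Returns [overall average, coarsest detail, ..., finest details].
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = [float(v) for v in values]
    details = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        avgs = [(current[i] + current[i + 1]) / 2 for i in pairs]
        diffs = [(current[i] - current[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels are placed in front
        current = avgs
    return current + details


def haar_inverse(coeffs):
    """Invert haar_forward: rebuild the original values from the coefficients."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(current)]
        pos += len(current)
        nxt = []
        for avg, diff in zip(current, diffs):
            nxt.extend([avg + diff, avg - diff])
        current = nxt
    return current


def noisy_range_count(hist, lo, hi, sigma, rng):
    """Answer the range-count query over hist[lo..hi] (inclusive) after
    adding Gaussian noise to the Haar coefficients.

    ASSUMPTION: sigma is a placeholder noise scale, identical for every
    coefficient; it is not derived from a privacy budget.
    """
    coeffs = haar_forward(hist)
    noisy = [c + rng.gauss(0.0, sigma) for c in coeffs]
    released = haar_inverse(noisy)
    return sum(released[lo:hi + 1])
```

For example, `noisy_range_count([3, 1, 4, 1, 5, 9, 2, 6], 2, 5, 1.0, random.Random(0))` returns a perturbed version of the true count 19 for a toy eight-bin histogram. Because every reconstructed value is a linear combination of independent Gaussian noise terms, the noise in any range answer is itself Gaussian, which is what makes its variance straightforward to reason about.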