Abstract
Statistical partitioning and publishing is commonly used in location-based big data services to address queries such as the number of points of interest, available vehicles, traffic flows, infected patients, etc., within a certain range. Adding noise perturbation to the location-based statistical data according to the differential privacy model can reduce various risks caused by location privacy leakage while keeping the statistical characteristics of the published data. The traditional statistical partitioning and publishing methods realize the decomposition and indexing of 2D space from top to bottom. However, they can easily cause the over-partitioning or under-partitioning phenomenon, and therefore need multiple times of data scan. This paper proposes a grid clustering and differential privacy protection method for location-based statistical big data publishing scenarios. We implement location-based big data statistics in units of equal-sized grids and perform density classification on uniformly distributed grids by discrete wavelet transform. A bottom-up grid clustering algorithm is designed to perform on the blank and the uniform grids of the same density level based on neighborhood similarity. The Laplacian noise is incorporated into the clustering results according to the differential privacy model to form the published statistics. Experimental comparison of the real-world datasets manifests that the grid clustering and differential privacy publishing method proposed in this paper is superior to other existing partition publishing methods in terms of range querying accuracy and algorithm operating efficiency.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.