Multidimensional grid-based clustering with local differential privacy

Nan Fu,Weiwei Ni,Haibo Hu,Sen Zhang

doi:10.1016/j.ins.2022.12.037

Abstract

Privacy-preserving clustering has received increasing research attention in recent years. Local differential privacy (LDP) is a privacy model without relying on trusted third parties. It plays a crucial role in distributed privacy-preserving clustering. Most previous methods focus on partition-based clustering (e.g., k-means), which necessitates many iterative interactions. Massive iterations cause a division of the privacy budget and increase the amount of individual noise, affecting the clustering utility. To address this issue, we turn to grid-based clustering and design the GC-LDP algorithm to balance the privacy and clustering utility with only three rounds of interactions. In GC-LDP, our core contribution is to develop a non-uniform grid division method via the coefficient of variation (CV), which can generate a grid structure approximating the global data distribution within two rounds of interactions. Besides, we designed a new perturbation mechanism to reduce the amount of individual noise injection. Further, a cell aggregation method is developed by exploring the relative density difference among grids to achieve multi-density clustering. Theoretical analysis and experiments on real-world datasets show that GC-LDP can obtain high-quality clustering results while satisfying local differential privacy.

Full Text