Abstract
Density peaks clustering (DPC) is a novel and efficient density-based clustering algorithm. However, the leakage of sensitive information and the associated security risks in applications of clustering methods are rarely considered. To address this problem, we propose a differential privacy-preserving density peaks clustering method based on shared near neighbors similarity. First, the Euclidean distance and the shared near neighbors similarity are combined to define the local density of a sample, and Laplace noise is added to the local density and the shortest distance to protect privacy. Second, cluster center selection is optimized so that the initial cluster centers are chosen based on neighborhood information. Finally, each sample is assigned to the same cluster as its nearest neighbor with higher local density. Experimental results on both UCI and synthetic datasets show that, compared with other algorithms, our method protects data privacy more effectively and improves the quality of the clustering results.
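The noise-addition step can be illustrated with the standard Laplace mechanism, which perturbs a value with noise of scale sensitivity / epsilon. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `laplace_noise` and `perturb`, the sensitivity of 1.0, the budget epsilon = 0.5, and the sample densities are all hypothetical, since the abstract does not give them.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(values, sensitivity, epsilon, rng=None):
    """Add Laplace(sensitivity / epsilon) noise to every value."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


# Hypothetical local densities for five samples (illustrative values only).
rho = [3.2, 1.1, 4.8, 0.7, 2.5]

# Assumed sensitivity and privacy budget; the paper's settings are not stated here.
noisy_rho = perturb(rho, sensitivity=1.0, epsilon=0.5, rng=random.Random(0))
print(noisy_rho)
```

A smaller epsilon gives a larger noise scale, which strengthens privacy at the cost of clustering accuracy. The same helper could be applied to the shortest-distance values, with the caveat that how the privacy budget is split between the two quantities is not specified in this excerpt.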
Highlights
The rapid development of information technology and network technology has brought people from the traditional Internet era into the big data era, the artificial intelligence era, and the IoT era.
Differential privacy-preserving density peaks clustering based on shared near neighbors similarity (DP-DPCSNNS) is proposed in this paper.
Summary
L. Sun et al.: Differential Privacy-Preserving Density Peaks Clustering Based on Shared Near Neighbors Similarity
The rapid development of information technology and network technology has brought people from the traditional Internet era into the big data era, the artificial intelligence era, and the IoT era. Privacy models built on equivalence classes, which include K-anonymity [4], l-diversity [5], and t-closeness [6], do not provide sufficient security and must be continuously improved as new attack models emerge. The shared near neighbors similarity and the Euclidean distance are combined to calculate the local density of the samples, which avoids having to select the cutoff distance parameter dc in the DPC algorithm. Differential privacy-preserving density peaks clustering based on shared near neighbors similarity (DP-DPCSNNS) is proposed.
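To make the idea of shared near neighbors concrete, the sketch below counts how many of the k nearest neighbors two samples have in common, using Euclidean distance. This is one common definition of shared-neighbor similarity and is an assumption here: the excerpt does not give the paper's exact similarity formula or how it is combined with distance into the local density. The names `knn` and `snn_count`, the value k = 2, and the toy points are hypothetical.

```python
import math


def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    order = sorted(others, key=lambda j: math.dist(points[i], points[j]))
    return set(order[:k])


def snn_count(points, i, j, k):
    """Number of neighbors shared by the k-neighborhoods of samples i and j."""
    return len(knn(points, i, k) & knn(points, j, k))


# Two well-separated toy groups of three points each (illustrative data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

same = snn_count(points, 0, 1, k=2)   # samples from the same group
cross = snn_count(points, 0, 3, k=2)  # samples from different groups
print(same, cross)
```

Samples in the same dense region tend to share neighbors, while samples from different regions share none, which is why a neighbor-based similarity can stand in for a hand-tuned cutoff distance when estimating density.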