Communication-Efficient Privacy-Preserving Clustering

Geetha Jagannathan, Rebecca N Wright, Krishnan Pillaipakkamnatt, Daryl Umano

doi:10.5555/1747335.1747336

Abstract

The ability to store vast quantities of data and the emergence of high-speed networking have led to intense interest in distributed data mining. However, privacy concerns, as well as regulations, often prevent the sharing of data between multiple parties. Privacy-preserving distributed data mining allows the cooperative computation of data mining algorithms without requiring the participating organizations to reveal their individual data items to each other. This paper makes several contributions. First, we present a simple, deterministic, I/O-efficient k-clustering algorithm that was designed with the goal of enabling an efficient privacy-preserving version of the algorithm. Our algorithm examines each item in the database only once and uses only sequential access to the data. Our experiments show that this algorithm produces cluster centers that are, on average, more accurate than the ones produced by the well-known iterative k-means algorithm, and that it compares well against BIRCH. Second, we present a distributed privacy-preserving protocol for k-clustering based on our new clustering algorithm. The protocol applies to databases that are horizontally partitioned between two parties. The participants of the protocol learn only the final cluster centers on completion of the protocol. Unlike most of the earlier results in privacy-preserving clustering, our protocol does not reveal intermediate candidate cluster centers. The protocol is also efficient in terms of communication and does not depend on the size of the database. Although there have been other clustering algorithms that improve on the k-means algorithm, ours is the first for which a communication-efficient cryptographic privacy-preserving protocol has been demonstrated.
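
To make the phrase "examines each item in the database only once and uses only sequential access" concrete, here is a minimal sketch of a deterministic one-pass k-clustering loop. It is an illustration only and is not the algorithm from the paper, whose details the abstract does not give. The function name `one_pass_k_clustering`, the closest-pair merge rule, and the weighted-mean update are all assumptions chosen for simplicity.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def one_pass_k_clustering(points, k):
    """Illustrative single-scan clustering (hypothetical, not the paper's method).

    Reads each point exactly once, in order, and keeps at most k
    (center, weight) pairs in memory. When a new point pushes the count
    above k, the two closest centers are merged into their weighted mean.
    """
    clusters = []  # list of (center, weight) pairs
    for p in points:
        # Every incoming point starts as its own cluster of weight 1.
        clusters.append((tuple(float(x) for x in p), 1))
        if len(clusters) > k:
            # Deterministic choice: min() returns the first closest pair
            # in index order, so ties never depend on randomness.
            i, j = min(
                combinations(range(len(clusters)), 2),
                key=lambda ij: sq_dist(clusters[ij[0]][0], clusters[ij[1]][0]),
            )
            (ci, wi), (cj, wj) = clusters[i], clusters[j]
            merged = tuple((wi * a + wj * b) / (wi + wj) for a, b in zip(ci, cj))
            del clusters[j]  # j > i, so index i stays valid
            clusters[i] = (merged, wi + wj)
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 10), (10, 12)]
    for center, weight in one_pass_k_clustering(data, k=2):
        print(center, weight)
```

Because the loop keeps only k weighted centers as state, the memory it needs is independent of the number of records, which is the kind of property that makes a secure two-party version plausible to build. The sketch itself involves no cryptography and offers no privacy guarantees.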

Journal: Transactions on Data Privacy
Publication Date: Apr 1, 2010
Citations: 53

Similar Papers

Performance analysis of privacy preserving distributed data mining based on cryptographic techniques
Venkatesh Kumar Marimuthu ... C Lakshmi
11 Feb 2021

Privacy-Preserving Hierarchical-k-Means Clustering on Horizontally Partitioned Data
Anrong Xue ... Handa Ma
International Journal of Distributed Sensor Networks | Vol. 5
01 Jan 2009

Privacy preserving distributed data mining based on secure multi-party computation
Jun Liu ... Nirwan Ansari
Computer Communications | Vol. 153
08 Feb 2020

Privacy-Preserving Distributed Data Mining Techniques: A Survey
N Subhash ... V Baby
International Journal of Computer Applications | Vol. 143
17 Jun 2016
