Abstract

k-means clustering is widely used in fields such as data mining, machine learning, and information retrieval. In many cases, users need to cooperate to perform k-means clustering tasks, and how to do so without revealing private data has become an active research topic. However, existing k-means schemes based on secure multi-party computation cannot effectively protect the privacy of the output results, while multi-party k-means schemes based on differential privacy may lead to a loss of data utility. In this article, we propose a practical protocol for collaborative k-means clustering that protects the privacy of each data record. Our protocol is the first to combine secure multi-party computation and differential privacy to train a privacy-preserving k-means clustering model. We design a novel algorithm that lets multiple parties collaboratively update the cluster centers without leaking private data. The algorithm guarantees that noise is added only once in each iteration, regardless of the number of participants. The protocol achieves the "best of both worlds": it provides both input privacy and output privacy in the k-means clustering scheme. Evaluation on real data sets shows that the running time of our scheme is comparable to that of k-means clustering without privacy protection.
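The idea of adding noise once per iteration, independent of the number of participants, can be illustrated with a minimal sketch. It is not the protocol proposed in the article. It assumes that each party computes per-cluster sums and counts locally, that these are aggregated, and that Laplace noise is added once to the aggregate before the centers are recomputed. The plain summation stands in for a secure multi-party aggregation step, and the function names (`local_stats`, `update_centers`) and the `noise_scale` parameter are hypothetical. Calibrating the noise to a privacy budget is outside the scope of the sketch.

```python
import random


def local_stats(points, centers):
    """Per-cluster coordinate sums and counts for one party's private points."""
    k, d = len(centers), len(centers[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for p in points:
        # Assign the point to its nearest center (squared Euclidean distance).
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
        counts[j] += 1
        for i in range(d):
            sums[j][i] += p[i]
    return sums, counts


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def update_centers(parties, centers, noise_scale, rng):
    """One k-means iteration over all parties' data.

    Assumption: the plain additions below stand in for a secure aggregation
    step, so that in a real protocol no party sees another party's statistics.
    """
    k, d = len(centers), len(centers[0])
    total_sums = [[0.0] * d for _ in range(k)]
    total_counts = [0] * k
    for points in parties:
        sums, counts = local_stats(points, centers)
        for j in range(k):
            total_counts[j] += counts[j]
            for i in range(d):
                total_sums[j][i] += sums[j][i]

    # Noise is added once, to the aggregate, not once per party.
    new_centers = []
    for j in range(k):
        noisy_count = total_counts[j] + laplace(noise_scale, rng)
        if noisy_count < 1:
            # Too little (noisy) mass in this cluster: keep the old center.
            new_centers.append(list(centers[j]))
            continue
        new_centers.append(
            [(total_sums[j][i] + laplace(noise_scale, rng)) / noisy_count
             for i in range(d)]
        )
    return new_centers


if __name__ == "__main__":
    rng = random.Random(0)
    party_a = [(0.0, 0.0), (10.0, 12.0)]
    party_b = [(0.0, 2.0), (10.0, 10.0)]
    start = [(0.0, 0.0), (10.0, 10.0)]
    print(update_centers([party_a, party_b], start, noise_scale=0.5, rng=rng))
```

With `noise_scale=0` the update reduces to an ordinary k-means step over the union of the parties' data, which is a convenient sanity check. Because the noise is applied to the aggregate, its magnitude in this sketch does not grow with the number of parties.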
