Data clustering is the unsupervised classification of data records into groups. As one of the steps in data analysis, it has been widely researched and applied in practice, in areas such as pattern recognition, image processing, information retrieval, geography, and marketing. In addition, the rapid increase in data volume in recent years makes it hard for resource-constrained data owners to perform computation on their own data. This has led to a trend in which users authorize the cloud to perform computation on stored data, such as keyword search, equality testing, and outsourced data clustering. In outsourced data clustering, the cloud classifies users' data into groups according to their similarities. Because outsourced data contain sensitive information and practical applications involve multiple data owners, it is necessary to develop a privacy-preserving outsourced clustering scheme under multiple keys. Recently, Rong et al. proposed a privacy-preserving outsourced k-means clustering scheme under multiple keys. However, in their scheme, the assistant server (AS) is able to extract the ratio of two underlying data records, and the key management server (KMS) can decrypt the ciphertexts of owners' data records, both of which break data privacy. AS can even deduce all data records if it knows one of them. To solve these problems, we propose a highly secure privacy-preserving outsourced k-means clustering scheme under multiple keys in cloud computing. In our scheme, a noncolluding cloud computing service (CCS) and KMS jointly perform clustering over the encrypted data records without exposing data privacy. Specifically, we doubly encrypt data records with BCP encryption, which is additively homomorphic, and AES encryption, where the former cryptosystem prevents CCS from obtaining any useful information from received ciphertexts and the latter protects data records from being decrypted by KMS.
We first define five protocols that realize different functions and then present our scheme based on these protocols. Finally, we give security and performance analyses, which show that our scheme is comparable to existing schemes in functionality and security.
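To illustrate the additive homomorphic property of BCP encryption mentioned above, the following is a minimal toy sketch, not the scheme's actual implementation. It assumes a single key pair, tiny illustrative primes, and a simplified base `G = 2`; the real cryptosystem uses large primes and a carefully chosen generator. All names (`keygen`, `encrypt`, `decrypt`, `add`) and parameter values are hypothetical and chosen only for this example.

```python
import secrets

# Toy parameters (assumption): far too small to be secure.
P, Q = 1019, 1187
N = P * Q
N2 = N * N
G = 2  # simplified base (assumption); only needs to be coprime to N here


def keygen():
    """Return (secret key a, public key h = G^a mod N^2)."""
    a = secrets.randbelow(N2 - 1) + 1
    return a, pow(G, a, N2)


def encrypt(m, h):
    """Encrypt 0 <= m < N as (A, B) = (G^r, h^r * (1 + m*N)) mod N^2."""
    r = secrets.randbelow(N2 - 1) + 1
    A = pow(G, r, N2)
    B = (pow(h, r, N2) * (1 + m * N)) % N2
    return A, B


def decrypt(c, a):
    """Recover m from B / A^a = 1 + m*N (mod N^2)."""
    A, B = c
    inv = pow(pow(A, a, N2), -1, N2)  # modular inverse, Python 3.8+
    u = (B * inv) % N2
    return (u - 1) // N


def add(c1, c2):
    """Multiply ciphertexts componentwise; plaintexts add modulo N."""
    return (c1[0] * c2[0]) % N2, (c1[1] * c2[1]) % N2


a, h = keygen()
c_sum = add(encrypt(12, h), encrypt(30, h))
print(decrypt(c_sum, a))  # prints 42
```

In this sketch the party holding the ciphertexts can add encrypted values without learning them, which is the kind of operation needed to accumulate coordinates when updating k-means cluster centers. The multi-key setting and the additional AES layer described in the abstract are not modeled here.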