Abstract
Cloud storage gives users convenient storage services, but the large amount of duplicate data held in the cloud imposes a heavy storage burden and increases the risk of privacy leakage. To improve the utilization of cloud storage resources while protecting data confidentiality, randomized message-locked encryption (R-MLE) can be used to remove redundant data in the cloud. However, deduplication schemes based on R-MLE rely on bilinear maps, so searching for duplicate fingerprint-tags is computationally expensive. To improve deduplication efficiency, our previous work proposed a secure deduplication scheme based on an autoencoder model: the model generates an abstract-tag for each data item, and the similarity between abstract-tags is used to quickly narrow the search to the fingerprint-tags most likely to be duplicates, which greatly reduces the number of fingerprint-tag comparisons. Building on that scheme, this paper proposes a secure deduplication method based on k-means clustering. First, the abstract-tags in cloud storage are clustered. Next, the distance between the abstract-tag uploaded by a user and each cluster centroid is calculated, and the abstract-tags in the cluster with the nearest centroid are selected. Finally, duplicate detection is performed only on the fingerprint-tags corresponding to those abstract-tags. This further accelerates the filtering of fingerprint-tags. Experiments show that our method outperforms the secure deduplication method based on the autoencoder model.
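The cluster-then-filter step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes abstract-tags are short numeric vectors compared by squared Euclidean distance, uses a deterministic farthest-point initialization for k-means (chosen here only to keep the example reproducible), and replaces the pairing-based R-MLE fingerprint-tag test with a hypothetical `tags_match` callback. All function and variable names are illustrative.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def init_centroids(points, k):
    """Farthest-point initialization (an assumption, for determinism)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, iterations=20):
    """Plain Lloyd iterations; returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    for _ in range(iterations):
        labels = [nearest_centroid(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    # Final assignment so labels agree with the returned centroids.
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


def candidate_indices(query_tag, centroids, labels):
    """Indices of stored items in the cluster nearest to the query."""
    nearest = nearest_centroid(query_tag, centroids)
    return [i for i, lab in enumerate(labels) if lab == nearest]


def find_duplicate(query_tag, query_fp, centroids, labels,
                   fingerprints, tags_match):
    """Compare fingerprint-tags only within the nearest cluster.

    `tags_match` is a hypothetical stand-in for the R-MLE tag test.
    """
    for i in candidate_indices(query_tag, centroids, labels):
        if tags_match(query_fp, fingerprints[i]):
            return i
    return None


# Toy data: six stored abstract-tags forming two well-separated groups.
stored_tags = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
               (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
stored_fps = ["fpA", "fpB", "fpC", "fpD", "fpE", "fpF"]

centroids, labels = kmeans(stored_tags, k=2)
candidates = candidate_indices((5.0, 5.0), centroids, labels)
match = find_duplicate((5.0, 5.0), "fpE", centroids, labels,
                       stored_fps, lambda a, b: a == b)
print(candidates, match)  # [3, 4, 5] 4
```

In this toy run only three of the six stored fingerprint-tags are compared against the upload. The trade-off in any such filter is that a true duplicate whose abstract-tag falls into a different cluster would be missed by this sketch.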