ABSTRACT This paper designs two methods, MKSE and SEMSS. MKSE uses an improved TF-IDF weight calculation method to extract keywords and applies virtual keywords when constructing inverted indexes, making it difficult for malicious attackers to infer the index content. SEMSS uses the Apriori algorithm to mine co-occurrence relationships between words and find the keyword sets that meet the minimum support threshold, improving the recall rate of search results. Finally, the security of the scheme is verified in terms of semantic security, efficiency, data integrity, etc. The results show that the data encryption time of the MKSE and TRSE methods increased gradually as the stored document collection grew, and that the index build time also increased as the document set grew. The accuracy of the improved TF-IDF method was 63.8%. The running time of Apriori decreased as the minimum support increased; at a minimum support of 12.0%, the Apriori algorithm ran for 211 seconds. The MKSE method was more efficient than the TRSE method in searching documents by query keywords. When the document set size was 3000, the SEMSS method had a full search rate of 81.09%. This research achieves semantic security for outsourced data and enables efficient and comprehensive cryptographic retrieval based on keyword sorting.
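As a rough illustration of the co-occurrence mining step that the abstract attributes to SEMSS, the following is a minimal textbook Apriori sketch in Python. It is not the paper's implementation: the function name `apriori`, the sample keyword sets in `docs`, and the choice to express support as a fraction of documents (so the abstract's 12.0% would correspond to `min_support=0.12`) are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every itemset whose support
    (fraction of transactions containing it) is at least min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single keywords.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        prev = list(current)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical keyword sets extracted from four documents (illustrative only).
docs = [
    {"cloud", "encryption", "search"},
    {"cloud", "encryption"},
    {"cloud", "search"},
    {"encryption", "index"},
]
result = apriori(docs, min_support=0.5)
```

Raising `min_support` shrinks the candidate sets at every level, which is consistent with the reported trend that Apriori's running time decreases as the minimum support increases.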