Abstract

Traditional privacy-preserving ranked search schemes based on the term frequency-inverse document frequency (TF-IDF) model rarely consider the latent semantic meanings of documents and keywords, and designing efficient semantic-aware ranked search (SRSE) schemes that preserve privacy remains a challenge. In this paper, two privacy-preserving SRSE schemes are developed for cloud environments. The first is an accuracy-first search scheme. It adopts the Latent Dirichlet Allocation (LDA) topic model to generate topic-based, semantic information-embedded vectors for documents and queried keywords, which supports semantic-aware relevance measurement. The bisecting k-means clustering algorithm is used to build an accuracy-first filtering tree index (AFF-tree), and an AFF-tree-based search algorithm is proposed to achieve accuracy-first ranked search. The second is an efficiency-first search scheme. It optimizes the structure of the AFF-tree to obtain a new efficiency-first filtering tree index (EFF-tree), on which an anchor node-based search algorithm achieves efficiency-first ranked search at the cost of a small decrease in search result precision. In both schemes, the secure inner product is used to perform privacy-preserving semantic-aware relevance measurement between documents and queried keywords. The security of the proposed schemes is analyzed with a game simulation-based proof. Experimental results show that the proposed schemes perform better in terms of search time cost.
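The relevance measurement in both schemes rests on an inner product that can be evaluated over encrypted vectors. The abstract does not spell out the construction, so the following is only a minimal sketch of the general idea, assuming a single secret invertible matrix M as the key: a document vector d is stored as M^T d, a query vector q is sent as M^{-1} q, and their inner product equals d . q. All function names and the example topic vectors are hypothetical, and the sketch omits the vector splitting and randomization that practical schemes add, so it should not be read as the paper's scheme or as secure on its own.

```python
import random
from fractions import Fraction


def random_invertible_matrix(n, rng):
    """Build an n x n integer matrix with determinant 1 (unit lower x unit upper)."""
    lower = [[1 if i == j else (rng.randint(-3, 3) if j < i else 0)
              for j in range(n)] for i in range(n)]
    upper = [[1 if i == j else (rng.randint(-3, 3) if j > i else 0)
              for j in range(n)] for i in range(n)]
    return [[sum(lower[i][k] * upper[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]


def invert(matrix):
    """Invert a square matrix exactly with Gauss-Jordan elimination over Fractions."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def encrypt_document(key, doc_vec):
    """Data owner side (assumed): store M^T d in the index instead of d."""
    return mat_vec(transpose(key), doc_vec)


def encrypt_query(key_inv, query_vec):
    """Data user side (assumed): send M^{-1} q as the trapdoor instead of q."""
    return mat_vec(key_inv, query_vec)


# Hypothetical 3-topic vectors standing in for LDA topic distributions.
rng = random.Random(0)
key = random_invertible_matrix(3, rng)
key_inv = invert(key)

doc = [Fraction(6, 10), Fraction(3, 10), Fraction(1, 10)]
query = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]

enc_doc = encrypt_document(key, doc)
enc_query = encrypt_query(key_inv, query)

# The server computes the score from ciphertexts only; it matches the plaintext score.
print(dot(enc_doc, enc_query) == dot(doc, query))
```

Exact rational arithmetic is used here so that the equality check is not disturbed by floating-point rounding; a real index would more likely use floating-point vectors and compare scores only for ranking.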
