Abstract

Traditional searchable encryption schemes that adopt the bag-of-words model need massive space to store the document set's index, because the dimension of each document vector equals the size of the dictionary. The bag-of-words model also ignores the semantic information between keywords and documents, so it can return non-relevant search results to users. Doc2Vec, a neural-network-based natural language processing model, uses the context information of words and paragraphs to extract documents' features. These features contain latent semantic information and can measure the similarity between documents. In this paper, we adopt the Doc2Vec model to achieve a semantic-aware multikeyword ranked search scheme. The Doc2Vec model uses distributed representations of words and documents with vectors of modest dimensionality, while trained on a dataset of a few hundred million words. The distributed representations of documents are extracted by the Doc2Vec model as document feature vectors and used as the search index. The features of the queried keywords are likewise extracted as the query feature vector, and the secure inner product operation is applied to the query feature vector and the index to achieve privacy-preserving semantic search. With the Doc2Vec model, our scheme also supports dynamic updates to the document set. An experiment on a real-world dataset shows that the fixed-length feature vector improves the time and space efficiency of semantic-aware search.
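The abstract does not spell out the secure inner product operation. The sketch below assumes the split-and-multiply construction commonly used in ranked searchable encryption (a secret bit vector plus two secret invertible matrices); this is an assumption, not a construction confirmed by this summary. All function names (`keygen`, `encrypt_index`, `trapdoor`, `score`) are hypothetical, and exact fractions replace floating point so that the equality can be checked exactly.

```python
import random
from fractions import Fraction

def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def invert(m):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def random_invertible(n, rng):
    """Draw small random integer matrices until one is invertible."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
        try:
            return m, invert(m)
        except ValueError:
            continue

def keygen(n, seed=0):
    """Hypothetical secret key: split indicator s and matrices M1, M2."""
    rng = random.Random(seed)
    s = [rng.randint(0, 1) for _ in range(n)]
    m1, m1_inv = random_invertible(n, rng)
    m2, m2_inv = random_invertible(n, rng)
    return {"s": s, "m1": m1, "m1_inv": m1_inv,
            "m2": m2, "m2_inv": m2_inv, "rng": rng}

def split(v, s, split_on, rng):
    """Split v into two shares: random additive shares where the key bit
    equals split_on, identical copies elsewhere."""
    a, b = [], []
    for x, bit in zip(v, s):
        x = Fraction(x)
        if bit == split_on:
            r = Fraction(rng.randint(-100, 100), 10)
            a.append(r)
            b.append(x - r)
        else:
            a.append(x)
            b.append(x)
    return a, b

def encrypt_index(key, doc_vec):
    """Encrypted index: (M1^T p', M2^T p'')."""
    a, b = split(doc_vec, key["s"], 1, key["rng"])
    return mat_vec(transpose(key["m1"]), a), mat_vec(transpose(key["m2"]), b)

def trapdoor(key, query_vec):
    """Encrypted query: (M1^-1 q', M2^-1 q'')."""
    a, b = split(query_vec, key["s"], 0, key["rng"])
    return mat_vec(key["m1_inv"], a), mat_vec(key["m2_inv"], b)

def score(enc_index, td):
    """Computed by the server on ciphertexts; equals the plaintext inner product."""
    return (sum(x * y for x, y in zip(enc_index[0], td[0]))
            + sum(x * y for x, y in zip(enc_index[1], td[1])))

# Illustrative 3-dimensional vectors (real Doc2Vec vectors would be longer).
key = keygen(3, seed=42)
doc = [Fraction(1, 2), Fraction(-1, 4), Fraction(3, 4)]
query = [Fraction(1), Fraction(2), Fraction(-1)]
plain = sum(p * q for p, q in zip(doc, query))
assert score(encrypt_index(key, doc), trapdoor(key, query)) == plain
```

The matrices cancel inside the product and the random shares recombine coordinate by coordinate, so the server obtains the relevance score without seeing either plaintext vector. Because the index dimension equals the feature-vector length rather than the dictionary size, the matrices stay small.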

Highlights

  • With the rapid development of cloud services, a large number of users upload their private data to the cloud to save on data maintenance costs

  • Researchers have proposed many searchable encryption schemes, such as single keyword search [1]–[7], multikeyword search [8]–[19], fuzzy keyword search [20]–[24] and conjunctive keyword search [25]–[28]

  • We propose a semantic-aware multikeyword ranked search scheme over encrypted cloud data based on the Doc2Vec model, which incorporates semantic features of the users’ search intention in searchable encryption


Summary

INTRODUCTION

With the rapid development of cloud services, a large number of users upload their private data to the cloud to save on data maintenance costs. By adopting the Doc2Vec model, our scheme uses semantic features to achieve semantic-aware search over encrypted cloud data. Wang et al. [6] proposed a single-keyword ranked searchable encryption scheme. Their scheme used the TF-IDF model for document representation, which is an effective model for feature extraction. We use the Doc2Vec model to extract features from the document dataset and achieve semantic-aware multikeyword ranked search.

B. WORD2VEC AND DOC2VEC

Traditional searchable encryption schemes mostly adopt the bag-of-words model to transform every document into a document vector whose dimension is equal to the size of the dictionary. Here, VQ is the query topic vector of Q, while DVi and DVj are the document feature vectors of di and dj, respectively.
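The formula that relates VQ, DVi and DVj is not reproduced in this summary. A minimal sketch, assuming that documents are ranked by the inner product between the query vector and each document feature vector (consistent with the abstract's inner product operation, but not confirmed here), could look as follows. The function names and the sample vectors are hypothetical.

```python
import heapq

def inner_product(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def top_k(query_vec, doc_vecs, k):
    """Return the k document ids with the largest inner-product score, best first.

    doc_vecs maps a document id to its feature vector (assumed layout).
    """
    scored = ((inner_product(query_vec, dv), doc_id)
              for doc_id, dv in doc_vecs.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]

# Illustrative 2-dimensional vectors; real feature vectors would be longer.
VQ = [0.6, 0.8]
docs = {"d1": [0.9, 0.1], "d2": [0.5, 0.9], "d3": [-0.2, 0.3]}
ranking = top_k(VQ, docs, 2)
```

Under this assumption, di ranks above dj whenever the score of VQ with DVi exceeds the score of VQ with DVj. In the sample data, d2 scores highest and d1 second.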

SYSTEM MODEL
DMRSE SCHEME
PERFORMANCE ANALYSIS
Findings
CONCLUSION
