The performance cost of preserving data/query privacy using searchable symmetric encryption

Shaun Mc Brearty,William Farrelly,Kevin Curran

doi:10.1002/sec.1699

Abstract

The benefits of Cloud computing include reduced costs, high reliability and the immediate availability of additional computing resources as needed. Despite such advantages, Cloud Service Provider (CSP) consumers need to be aware that the Cloud poses its own set of unique risks that are not typically associated with storing and processing one's own data internally using privately owned infrastructure. New techniques such as Searchable Encryption are being deployed to enable data to be encrypted online. Despite being a relatively obscure form of Cryptography, Searchable Encryption is now at the point that it can be deployed and used within the Cloud. Searchable Encryption can allow CSP customers to store their data in encrypted form, while retaining the ability to search that data without disclosing the associated decryption key(s) to CSPs. Searchable Encryption is a diverse subject that exists in many forms. Searchable Symmetric Encryption (SSE), which has its roots in plaintext searching, is one such form. Although symmetrically encrypted ciphertext cannot be searched in the same manner as plaintext, many of the principles that apply to plaintext searching also apply to SSE. In its most basic form, SSE is nothing more than an Inverted Index (a mechanism that has been used in plaintext Information Retrieval (IR) for decades) that has been modified and adapted for use with ciphertext. We implement an SSE scheme and evaluate the efficiency of storing and retrieving data from the Cloud. The results show that the time taken to carry out a task using SSE is directly proportional to the amount of information involved. In the case of constructing an IR Inverted Index, the time taken to generate the index is directly proportional to the number of Terms contained in the underlying Document Collection. The time taken to convert the same IR Inverted Index to an SSE Inverted Index is directly proportional to the number of Postings contained within the IR Inverted Index, while the time taken to encrypt the underlying Document Collection is directly proportional to the number of Terms contained within the Document Collection. In relation to searching in SSE, the time taken to identify and decrypt the set of Postings associated with a given Lexicon Term is directly proportional to the number of Postings. We believe that SSE is efficient enough to be deployed in a Cloud environment, especially when results only have to be returned to the user in small quantities. When applied to large Sets, SSE querying can become inefficient, as its search time is directly proportional to the number of matches. SSE, however, is designed to achieve efficient search speeds whilst maintaining Data Privacy. Copyright © 2016 John Wiley & Sons, Ltd.
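
The idea that SSE is, at its simplest, an Inverted Index adapted for ciphertext can be illustrated with a minimal sketch. The code below is not the scheme implemented in the paper; it is a toy construction under stated assumptions. Every name (`build_inverted_index`, `build_sse_index`, `trapdoor`, `search`) is hypothetical, HMAC-SHA256 stands in for a keyed pseudorandom function, and document IDs are assumed to fit in 4 bytes. The XOR-pad encryption of Postings is for illustration only and is not a vetted cipher.

```python
import hashlib
import hmac
import os
from collections import defaultdict


def build_inverted_index(docs):
    """Plaintext IR Inverted Index: Term -> sorted list of document IDs (Postings)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def prf(key, data):
    """Keyed pseudorandom function (assumption: HMAC-SHA256)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def build_sse_index(index, k_label, k_enc):
    """Convert the plaintext index: hide each Term behind a PRF label and
    encrypt each Posting. The work is one PRF call per Posting."""
    sse = {}
    for term, postings in index.items():
        t = term.encode()
        label = prf(k_label, t)      # what the server stores instead of the Term
        term_key = prf(k_enc, t)     # per-Term key used to mask the Postings
        sse[label] = [
            xor_bytes(doc_id.to_bytes(4, "big"),
                      prf(term_key, i.to_bytes(4, "big"))[:4])
            for i, doc_id in enumerate(postings)
        ]
    return sse


def trapdoor(term, k_label, k_enc):
    """Client side: derive the search token for one Term."""
    t = term.lower().encode()
    return prf(k_label, t), prf(k_enc, t)


def search(sse, token):
    """Server side: look up the label, then decrypt each matching Posting.
    The loop runs once per Posting stored for the queried Term."""
    label, term_key = token
    return [
        int.from_bytes(
            xor_bytes(ct, prf(term_key, i.to_bytes(4, "big"))[:4]), "big")
        for i, ct in enumerate(sse.get(label, []))
    ]


# Small usage example with made-up documents.
docs = {1: "cloud data privacy", 2: "cloud search", 3: "encrypted search index"}
k_label, k_enc = os.urandom(32), os.urandom(32)
sse_index = build_sse_index(build_inverted_index(docs), k_label, k_enc)
matches = search(sse_index, trapdoor("cloud", k_label, k_enc))  # [1, 2]
```

In this sketch both the conversion step and the search step loop once per Posting, which is consistent with the proportionality reported in the abstract. The token reveals the per-Term key to the server, so the server learns which encrypted Postings match a query; real SSE schemes analyse and bound this kind of leakage explicitly.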

Full Text