Cloud storage allows organizations to store data at remote sites of service providers. Although cloud storage services offer numerous benefits, they also involve new risks and challenges with respect to data security and privacy aspects. To preserve confidentiality, data must be encrypted before outsourcing to the cloud. Although this approach protects the security and privacy aspects of data, it also impedes regular functionality such as executing queries and performing analytical computations. To address this concern, specific data encryption schemes (e.g., deterministic, random, homomorphic, order-preserving, etc.) can be adopted that still support the execution of different types of queries (e.g., equality search, full-text search, etc.) over encrypted data.However, these specialized data encryption schemes have to be implemented and integrated in the application and their adoption introduces an extra layer of complexity in the application code. Moreover, as these schemes imply trade-offs between performance and security, storage efficiency, etc, making the appropriate trade-off is a challenging and non-trivial task. In addition, to support aggregate queries, User Defined Functions (UDF) have to be implemented directly in the database engine and these implementations are specific to each underlying data storage technology, which demands expert knowledge and in turn increases management complexity.In this paper, we introduce CryptDICE, a distributed data protection system that (i) provides built-in support for a number of different data encryption schemes, made accessible via annotations that represent application-specific (search) requirements; (ii) supports making appropriate trade-offs and execution of these encryption decisions at diverse levels of data granularity; and (iii) integrates a lightweight service that performs dynamic deployment of User Defined Functions (UDF) –without performing any alteration directly in the database engine– for heterogeneous NoSQL databases in order to realize low-latency aggregate queries and also to avoid expensive data shuffling (from the cloud to an on-premise data center). We have validated CryptDICE in the context of a realistic industrial SaaS application and carried out an extensive functional validation, which shows the applicability of the middleware platform. In addition, our experimental evaluation efforts confirm that the performance overhead of CryptDICE is acceptable and validates the performance optimizations for achieving low-latency aggregate queries.
Read full abstract