Abstract

Background: One of the tasks in the iDASH Secure Genome Analysis Competition in 2016 was to demonstrate the feasibility of privacy-preserving queries on homomorphically encrypted genomic data. More precisely, given a list of up to 100,000 mutations, the task was to encrypt the data using homomorphic encryption in a way that allows it to be stored securely in the cloud, and enables the data owner to query the dataset for the presence of specific mutations, without revealing any information about the dataset or the queries to the cloud.

Methods: We devise a novel string matching protocol to enable privacy-preserving queries on homomorphically encrypted data. Our protocol combines state-of-the-art techniques from homomorphic encryption and private set intersection protocols to minimize the computational and communication cost.

Results: We implemented our protocol using the homomorphic encryption library SEAL v2.1, and applied it to obtain an efficient solution to the iDASH competition task. For example, using 8 threads, our protocol achieves a running time of only 4 s, and a communication cost of 2 MB, when querying for the presence of 5 mutations from an encrypted dataset of 100,000 mutations.

Conclusions: We demonstrate that homomorphic encryption can be used to enable an efficient privacy-preserving mechanism for querying the presence of particular mutations in realistic size datasets. Beyond its applications to genomics, our protocol can just as well be applied to any kind of data, and is therefore of independent interest to the homomorphic encryption community.

Highlights

  • One of the tasks in the iDASH Secure Genome Analysis Competition in 2016 was to demonstrate the feasibility of privacy-preserving queries on homomorphically encrypted genomic data

  • While many of the techniques and algorithms presented in this paper are agnostic to the exact homomorphic encryption scheme that is being used, for simplicity we will restrict to Ring LWE-based cryptosystems using power-of-2 cyclotomic rings of integers [23]

  • When constructing the hashed dataset H(D), failing to construct the B cuckoo hash tables obtained within it may reveal to the server that a certain number of hash collisions occurred while hashing the dataset
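As a rough illustration of the last highlight, the sketch below shows plain (unencrypted) cuckoo hashing and how a table build can fail. It is not the paper's construction: the single-table layout, the two SHA-256-derived hash functions, the eviction rule, and the names `cuckoo_build`, `cuckoo_contains`, `_slot`, and `max_kicks` are all assumptions chosen for illustration, and the mutation identifiers are made up.

```python
import hashlib


def _slot(item: str, seed: int, size: int) -> int:
    """Hypothetical hash function: SHA-256 of seed and item, reduced mod size."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


def cuckoo_build(items, size, num_hashes=2, max_kicks=100):
    """Place every item in one of its candidate slots.

    Returns the filled table, or None if some item could not be placed
    within max_kicks evictions (the failure case).
    """
    table = [None] * size
    for item in items:
        current = item
        kicks = 0
        for _ in range(max_kicks):
            slots = [_slot(current, s, size) for s in range(num_hashes)]
            # Use an empty candidate slot if one exists.
            empty = next((p for p in slots if table[p] is None), None)
            if empty is not None:
                table[empty] = current
                current = None
                break
            # Otherwise evict the occupant of one candidate slot and
            # try to re-place the evicted item on the next iteration.
            victim = slots[kicks % num_hashes]
            kicks += 1
            table[victim], current = current, table[victim]
        if current is not None:
            return None  # build failed: too many collisions
    return table


def cuckoo_contains(table, item, num_hashes=2):
    """An item can only live in one of its candidate slots."""
    return any(
        table[_slot(item, s, len(table))] == item for s in range(num_hashes)
    )


if __name__ == "__main__":
    dataset = ["chr1:12345:A>G", "chr2:67890:C>T", "chrX:13579:G>A"]
    table = cuckoo_build(dataset, size=64)
    if table is None:
        print("build failed")
    else:
        print(all(cuckoo_contains(table, m) for m in dataset))
```

A `None` result corresponds to the failure case the highlight mentions: if the party observing the build can tell that it failed, it learns that the data produced an unusually bad collision pattern.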


Summary

Introduction

One of the tasks in the iDASH Secure Genome Analysis Competition in 2016 was to demonstrate the feasibility of privacy-preserving queries on homomorphically encrypted genomic data. In 2015 and 2016, iDASH (integrating Data for Analysis, Anonymization, and Sharing) hosted two international contests on Secure Genome Analysis. Teams from around the world participated to test the limits of secure computation on genomic data, and to benchmark solutions on real data sets. Such contests serve to bring together experts in security, cryptography, and bioinformatics to quickly make progress on interdisciplinary challenges. The 2016 task for outsourced storage and computation was to implement a method for private queries, that is, string matching on encrypted genomic data.

