Abstract

The human genome can reveal sensitive information and is potentially re-identifiable, which raises privacy and security concerns about sharing such data on a wide scale. In 2016, we organized the third Critical Assessment of Data Privacy and Protection competition as a community effort to bring together biomedical informaticists, computer privacy and security researchers, and scholars in ethical, legal, and social implications (ELSI) to assess the latest advances in privacy-preserving techniques for protecting human genomic data. Teams were asked to develop novel protection methods for emerging genome privacy challenges in three scenarios: Track (1) data sharing through the Beacon service of the Global Alliance for Genomics and Health; Track (2) collaborative discovery of similar genomes between two institutions; and Track (3) data outsourcing to public cloud services. The latter two tracks continued themes from our 2015 competition, while the first was new and responded to a recently established vulnerability. The winning strategy for Track 1 mitigated the privacy risk by hiding approximately 11% of the variation in the database while permitting around 160,000 queries, a significant improvement over the baseline. The winning strategies in Tracks 2 and 3 showed significant progress over the previous competition by achieving multiple orders of magnitude performance improvement in terms of computational runtime and memory requirements. The outcomes suggest that applying highly optimized privacy-preserving and secure computation techniques to safeguard genomic data sharing and analysis is useful. However, the results also indicate that further efforts are needed to refine these techniques into practical solutions.

Highlights

  • Rapid advances in sequencing technologies have enabled the meaningful use of human genomic data in a wide range of healthcare and biomedical applications.[1]

  • Beacons are web-based services that answer queries about allele presence, such as whether a specific nucleotide (e.g., T) exists in a data set at a specific genomic position.

  • Since only a subset of the teams submitted papers to the BMC special issue, we provide a link on our competition website[25] to recordings of their presentations for readers who may be interested in the technical details.
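
The Beacon behaviour described in the highlights (a yes/no answer about whether an allele is present at a genomic position) can be illustrated with a minimal in-memory sketch. This is a toy model under stated assumptions, not the Global Alliance for Genomics and Health API or any team's solution: the names `ToyBeacon` and `Variant`, the tuple layout, and the sample genomes are all hypothetical.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: (chromosome, position, allele).
Variant = Tuple[str, int, str]


class ToyBeacon:
    """Toy allele-presence service; an illustration only, not the GA4GH Beacon API."""

    def __init__(self, genomes: Iterable[Iterable[Variant]]) -> None:
        # Pool the alleles of all genomes; the beacon does not track which
        # individual carries which allele.
        self._alleles: Set[Variant] = set()
        for genome in genomes:
            for chromosome, position, allele in genome:
                self._alleles.add((chromosome, position, allele.upper()))

    def query(self, chromosome: str, position: int, allele: str) -> bool:
        # Answer only "is this allele present anywhere in the data set?"
        return (chromosome, position, allele.upper()) in self._alleles


# Made-up data for two individuals.
genomes = [
    [("1", 1000, "T"), ("2", 500, "G")],
    [("1", 1000, "A"), ("2", 500, "G")],
]
beacon = ToyBeacon(genomes)
print(beacon.query("1", 1000, "T"))
print(beacon.query("1", 1000, "C"))
```

The sketch returns only presence or absence, never genotypes or counts. As the abstract notes, even this restricted interface was the subject of a recently established vulnerability, which is what Track 1 asked teams to mitigate.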

Summary

INTRODUCTION

Rapid advances in sequencing technologies have enabled the meaningful use of human genomic data in a wide range of healthcare and biomedical applications.[1] The best solutions showed encouraging results, with potential use in GWAS while providing provable privacy guarantees.[19,27,28] The 2014 competition did not address privacy and security issues of storage and computation, which are among the most critical when utilizing cloud computing services to conduct human genomic research. This claim is justified by a recent Science paper[38] that reported on applying secure multiparty computation (SMC) to study common phenotypes of patients who share the same rare variants across two hospitals. In this track, we asked teams to develop SMC solutions for a scenario where privacy was required for coordination between two institutions. We assessed the solutions in terms of (1) accuracy

The bold values indicate the best performance among teams
