Abstract

Sharing human genotype and phenotype data is essential to discover otherwise inaccessible genetic associations, but is a challenge because of privacy concerns. Here, we present a method of homomorphic encryption that obscures individuals' genotypes and phenotypes, and is suited to quantitative genetic association analysis. Encrypted ciphertext and unencrypted plaintext are analytically interchangeable. The encryption uses a high-dimensional random linear orthogonal transformation key that leaves the likelihood of quantitative trait data unchanged under a linear model with normally distributed errors. It also preserves linkage disequilibrium between genetic variants and associations between variants and phenotypes. It scrambles relationships between individuals: encrypted genotype dosages closely resemble Gaussian deviates, and can be replaced by quantiles from a Gaussian with negligible effects on accuracy. Likelihood-based inferences are unaffected by orthogonal encryption. These include linear mixed models to control for unequal relatedness between individuals, heritability estimation, and including covariates when testing association. Orthogonal transformations can be applied in a modular fashion for multiparty federated mega-analyses where the parties first agree to share a common set of genotype sites and covariates prior to encryption. Each then privately encrypts and shares their own ciphertext, and analyses all parties' ciphertexts. In the absence of private variants, or knowledge of the key, we show that it is infeasible to decrypt ciphertext using existing brute-force or noise-reduction attacks. We present the method as a challenge to the community to determine its security.
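
The invariance described above can be checked numerically. The following is a minimal sketch on simulated data, not the authors' implementation: the helper names (`random_orthogonal`, `ols`, `simulate_party`), the sample sizes, the effect sizes, and the choice to draw each key by QR decomposition of a Gaussian matrix are all assumptions made for illustration. Two hypothetical parties each encrypt their own genotypes, covariates, and phenotypes with a private orthogonal key, and the pooled regression on the stacked ciphertexts is compared with the pooled regression on the plaintexts.

```python
import numpy as np


def random_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix (assumed key construction: QR of a Gaussian matrix)."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flip column signs using the diagonal of R; the result is still orthogonal.
    return q * np.sign(np.diag(r))


def ols(X, y):
    """Return least-squares coefficients and the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)


def simulate_party(n, effects, rng):
    """Simulate one party: intercept plus genotype dosages (0/1/2) and a quantitative trait."""
    m = len(effects) - 1
    G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
    X = np.column_stack([np.ones(n), G])
    y = X @ effects + rng.standard_normal(n)
    return X, y


rng = np.random.default_rng(0)
effects = np.array([1.0, 0.5, 0.0, -0.3])  # hypothetical intercept and variant effects
parties = [simulate_party(n, effects, rng) for n in (60, 80)]

# Each party encrypts privately with its own key; keys are never shared.
ciphertexts = []
for X, y in parties:
    Q = random_orthogonal(len(y), rng)
    ciphertexts.append((Q @ X, Q @ y))

# Pool plaintexts (for reference only) and ciphertexts (what would be shared).
X_plain = np.vstack([X for X, _ in parties])
y_plain = np.concatenate([y for _, y in parties])
X_enc = np.vstack([X for X, _ in ciphertexts])
y_enc = np.concatenate([y for _, y in ciphertexts])

beta_plain, rss_plain = ols(X_plain, y_plain)
beta_enc, rss_enc = ols(X_enc, y_enc)

print("coefficients agree:", np.allclose(beta_plain, beta_enc))
print("residual sums of squares agree:", np.isclose(rss_plain, rss_enc))
# Cross-products between variant columns (the basis of LD) are also preserved.
print("cross-products agree:", np.allclose(X_enc.T @ X_enc, X_plain.T @ X_plain))
```

The pooled analysis works because stacking the parties' keys gives a block-diagonal matrix, which is itself orthogonal. Note that in this sketch the intercept column is encrypted along with the genotypes, so the ciphertext regression does not add a fresh column of ones.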

Highlights

  • Sharing human genotype and phenotype data is essential to discover otherwise inaccessible genetic associations, but is a challenge because of privacy concerns

  • Typically, only the summary statistics of genome-wide association studies (GWAS) are distributed for a federated meta-analysis; these usually comprise the regression coefficients and P-values of the genetic variants tested for association with the phenotype

  • Homomorphic encryption (HE) refers to cryptographic systems that allow computations to be performed on encrypted data without decrypting it, and which yield the same answers as when the analogous computations are performed on the original data

Summary

Introduction

Sharing human genotype and phenotype data is essential to discover otherwise inaccessible genetic associations, but is a challenge because of privacy concerns. The encryption presented here uses a high-dimensional random linear orthogonal transformation key that leaves the likelihood of quantitative trait data unchanged under a linear model with normally distributed errors. It preserves linkage disequilibrium between genetic variants and associations between variants and phenotypes. Homomorphic encryption (HE) refers to cryptographic systems that allow computations to be performed on encrypted data (the ciphertext) without decrypting it, and which yield the same answers as when the analogous computations are performed on the original data (the plaintext). It is an active area of research in computer science because it could make cloud computing much more secure, for both genetic and other applications. Should a cloud service be compromised, any stolen ciphertext would be valueless.
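
The claim that the likelihood is unchanged follows from a short argument, sketched here under assumed notation that is not taken from the source: y is the phenotype vector for n individuals, X the matrix of genotypes and covariates, β the effects, σ² the error variance, and Q the n × n orthogonal key (so that QᵀQ = I).

```latex
\begin{align*}
y &= X\beta + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_n), \\
\ell(\beta, \sigma^2;\, y, X) &= -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert^2, \\
Qy &= QX\beta + Q\varepsilon, \qquad
    Q\varepsilon \sim \mathcal{N}(0, \sigma^2 Q Q^\top) = \mathcal{N}(0, \sigma^2 I_n), \\
\lVert Qy - QX\beta \rVert^2 &= (y - X\beta)^\top Q^\top Q\,(y - X\beta)
    = \lVert y - X\beta \rVert^2, \\
\ell(\beta, \sigma^2;\, Qy, QX) &= \ell(\beta, \sigma^2;\, y, X).
\end{align*}
```

Because the two log-likelihoods coincide for every β and σ², maximum-likelihood estimates and likelihood-ratio tests computed from the ciphertext (Qy, QX) match those computed from the plaintext (y, X).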
