Implementing Privacy-Preserving Genotype Analysis with Consideration for Population Stratification

Andre Ostrak, Ville Sokk, Liina Kamm, Jaak Randmets, Sven Laur

doi:10.3390/cryptography5030021


Open Access

https://doi.org/10.3390/cryptography5030021


Journal: Cryptography
Publication Date: Aug 20, 2021
Citations: 5
License type: CC BY 4.0

Affiliation: Cybernetica (Estonia), University of Tartu

Abstract

In bioinformatics, genome-wide association studies (GWAS) are used to detect associations between single-nucleotide polymorphisms (SNPs) and phenotypic traits such as diseases. Significant differences in SNP counts between case and control groups can signal association between variants and phenotypic traits. Most traits are affected by multiple genetic locations. To detect these subtle associations, bioinformaticians need access to more heterogeneous data. Regulatory restrictions in cross-border health data exchange have created a surge in research on privacy-preserving solutions, including secure computing techniques. However, in studies of such scale, one must account for population stratification, as under- and over-representation of sub-populations can lead to spurious associations. We improve on the state of the art of privacy-preserving GWAS methods by showing how to adapt principal component analysis (PCA) with stratification control (EIGENSTRAT), FastPCA, EMMAX and the genomic control algorithm for secure computing. We implement these methods using secure computing techniques—secure multi-party computation (MPC) and trusted execution environments (TEE). Our algorithms are the most complex ones at this scale implemented with MPC. We present performance benchmarks and a security and feasibility trade-off discussion for both techniques.
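To make the case/control comparison and the genomic control correction mentioned in the abstract concrete, the following is a minimal plaintext sketch in Python. It is not the paper's secure implementation: it runs on unprotected data, uses no MPC or TEE, and the function names `allelic_chi2` and `genomic_control` are hypothetical names chosen for illustration. It assumes the common formulation in which the inflation factor is the median of the per-SNP chi-square statistics divided by the median of a chi-square distribution with one degree of freedom (about 0.455), and it assumes the usual convention of never letting the factor drop below 1.

```python
from statistics import median

# Approximate median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 df) for a 2x2 allele-count table."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    denom = row_case * row_ctrl * col_alt * col_ref
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence either way.
        return 0.0
    return n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / denom


def genomic_control(stats):
    """Return (inflation factor, corrected statistics).

    Assumption: the factor is clamped to at least 1.0, so statistics are
    only ever deflated, never inflated.
    """
    lam = max(1.0, median(stats) / CHI2_1DF_MEDIAN)
    return lam, [s / lam for s in stats]


if __name__ == "__main__":
    # Made-up allele counts per SNP: (case_alt, case_ref, ctrl_alt, ctrl_ref).
    tables = [(30, 70, 20, 80), (55, 45, 40, 60), (12, 88, 10, 90)]
    stats = [allelic_chi2(*t) for t in tables]
    lam, corrected = genomic_control(stats)
    print("lambda:", lam)
    print("corrected statistics:", corrected)
```

Dividing every statistic by one shared factor is what makes genomic control cheap compared with PCA-based corrections such as EIGENSTRAT, since it needs only the per-SNP statistics and their median.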

Highlights

These groups are compared to each other in case–control studies to find single-nucleotide polymorphisms (SNPs) in the DNA sequence that are significantly overrepresented in one group.
We first describe and discuss EIGENSTRAT and FastPCA, then move on to EMMAX, and finally describe the genomic control algorithm.
Because the privacy-preserving EMMAX algorithm is significantly slower than the privacy-preserving principal component analysis (PCA) algorithm, we measured running times for 1000, 5000 and 20,000 SNPs with 217 donors, and for 1000 SNPs with 100 and 434 donors.

Summary

Introduction

At least two of these problems, the existence of polygenic phenotypes and population stratification, can be alleviated by using more heterogeneous databases with larger volumes of data.



