Abstract

Background: Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified.

Methods: We provide a framework for measuring the risk to siblings of a patient's SNP genotype disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy.

Results: Extending this inference technique, we determine that a very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs where one child is homozygous for the major allele and the minor allele frequency is ≤ 0.20 (N = 452684, 65.1%), we achieve 91.9% inference accuracy for sibling genotypes.

Conclusion: These findings demonstrate that substantial discrimination and privacy risks arise from use of inferred familial genomic data.

Highlights

  • Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified

  • To quantify the risk of SNP disclosure to relatives, we demonstrate a model for inferring sibling genotypes using proband SNP data and population-specific allele frequency databases, such as the HapMap[10,11]

  • There are other genomic data types which should be considered in a rigorous privacy and propensity analysis, including copy number variant and mutation data. These findings demonstrate that substantial discrimination and privacy concerns arise from use of inferred familial genomic data


Summary

Introduction

Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified. Genomic data are increasingly integrated into clinical environments, stored in genealogical and medical records[1,2] and shared with the broader research community[3,4] without full appreciation of the extent to which these commodity-level measurements may disclose the health risks or even identity of family members. Unlike conventional fingerprints, which provide little direct information about patients or relatives, SNP genotypes may encode phenotypic characteristics, which can link sequences to people[6]. Despite these privacy issues[7,8], use of genetic sequencing is increasing in
