Abstract
Background: Genome-wide methylation arrays are increasingly used tools in studies of complex medical disorders. Because of their expense and potential utility to the scientific community, current federal policy dictates that data from these arrays, like those from genome-wide genotyping arrays, be deposited in publicly available databases. Unlike the genotyping information, access to the expression data is not restricted. The current unrestricted access to methylation data rests on the belief that protected health information and personally identifying information cannot be simultaneously extracted from these arrays.
Results: In this communication, we analyze methylation data from the Illumina HumanMethylation450 array and show that genotype at 1,069 highly informative loci, as well as information on both alcohol consumption and smoking, can be derived from the array data.
Conclusions: We conclude that both potentially personally identifying information and substance-use histories can be simultaneously derived from methylation array data. Because access to genetic information about a database subject or one of their relatives is critical to the re-identification process, the risk of re-identification is limited at the current time. We propose that access to genome-wide methylation data be restricted to institutionally approved investigators who accede to data use agreements prohibiting re-identification.
Electronic supplementary material: The online version of this article (doi:10.1186/1868-7083-6-28) contains supplementary material, which is available to authorized users.
Highlights
Genome-wide methylation arrays are increasingly used tools in studies of complex medical disorders
A major factor in their rapid growth has been policies mandating the deposition of all genome-wide array data in publicly available repositories, such as those administered by the National Center for Biotechnology Information (NCBI) [1]
We describe a method by which information from a DNA methylation array could be used to generate individually identifying genetic profiles and to infer the substance use of study participants
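To make the idea of a methylation-derived genetic profile concrete, the sketch below shows one simple way genotype-like calls could be read from array beta values. It is not the authors' method, which this page does not describe. It assumes that a probe overlapping a common variant gives beta values clustering near 0, 0.5, and 1, corresponding to 0, 1, or 2 copies of one allele. The function names and the cut-offs (0.2, 0.4 to 0.6, 0.8) are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def call_genotype(beta: float, low: float = 0.2, high: float = 0.8) -> Optional[int]:
    """Map a beta value in [0, 1] to an allele count (0, 1, or 2).

    Returns None when the value falls between clusters and cannot be
    called. The thresholds are illustrative assumptions, not published
    cut-offs.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta values must lie between 0 and 1")
    if beta < low:
        return 0
    if beta > high:
        return 2
    if 0.4 <= beta <= 0.6:
        return 1
    return None  # ambiguous: between clusters


def genotype_profile(betas: Sequence[float]) -> List[Optional[int]]:
    """Call every probe in a sample to build a genotype-like profile."""
    return [call_genotype(b) for b in betas]


def concordance(a: Sequence[Optional[int]], b: Sequence[Optional[int]]) -> float:
    """Fraction of loci called in both profiles where the calls agree."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)
```

Under these assumptions, a profile called from a public methylation sample could be compared with a reference genotype using `concordance`. High agreement across many informative loci is what would make a match identifying, which is also why an attacker would need genetic information from the subject or a relative.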
Summary
Genome-wide methylation arrays are increasingly used tools in studies of complex medical disorders. Because of their expense and potential utility to the scientific community, current federal policy dictates that data from these arrays, like those from genome-wide genotyping arrays, be deposited in publicly available databases. A major factor in their rapid growth has been policies mandating the deposition of all genome-wide array data in publicly available repositories, such as those administered by the National Center for Biotechnology Information (NCBI) [1]. Without a doubt, these policies have led to significant advances in many areas, including evolutionary biology and healthcare.