As genomic research continues to advance, sharing of genomic data and research outcomes has become increasingly important for fostering collaboration and accelerating scientific discovery. However, such data sharing must be balanced with the need to protect the privacy of individuals whose genetic information is being utilized. This paper presents a bidirectional framework for evaluating privacy risks associated with data shared (both in terms of summary statistics and research datasets) in genomic research papers, particularly focusing on re-identification risks such as membership inference attacks (MIA). The framework consists of a structured workflow that begins with a questionnaire designed to capture researchers' (authors') self-reported data sharing practices and privacy protection measures. Responses are used to calculate the risk of re-identification for their study (paper) when compared with the National Institutes of Health (NIH) genomic data sharing policy. Any gaps in compliance help us to identify potential vulnerabilities and encourage the researchers to enhance their privacy measures before submitting their research for publication. The paper also demonstrates the application of this framework, using published genomic research as case study scenarios to emphasize the importance of implementing bidirectional frameworks to support trustworthy open science and genomic data sharing practices.
Read full abstract