In 2024, medical researchers in the Republic of Korea were invited to amend the health and medical data utilization guidelines (Government Publications Registration Number: 11-1352000-0052828-14). This study aimed to show the overall impact of the guideline revision, with a focus on clinical genomic data. This study amended the pseudonymization of genomic data defined in the previous version through a joint study led by the Ministry of Health and Welfare, the Korea Health Information Service, and the Korea Genome Organization. To develop the previous version, we held three conferences with four main medical research institutes and seven academic societies. We conducted two surveys targeting special genome experts in academia, industry, and institutes. We found that cases of pseudonymization in the application of genome data were rare and that there was ambiguity in the terminology used in the previous version of the guidelines. Most experts (>~90%) agreed that the 'reserved' condition should be eliminated to make genomic data available after pseudonymization. In this study, the scope of genomic data was defined as clinical next-generation sequencing data, including FASTQ, BAM/SAM, VCF, and medical records. Pseudonymization targets genomic sequences and metadata, embedding specific elements, such as germline mutations, short tandem repeats, single-nucleotide polymorphisms, and identifiable data (for example, ID or environmental values). Expression data generated from multi-omics can be used without pseudonymization. This amendment will not only enhance the safe use of healthcare data but also promote advancements in disease prevention, diagnosis, and treatment.
Read full abstract