Weighted Itemsets Error (WIE) Approach for Evaluating Generated Synthetic Patient Data

Mojtaba Zare, Janusz Wojtusiak

doi:10.1109/icmla.2018.00166

Abstract

Patient data are highly sensitive and are protected by federal, state, and local policies that restrict access to those authorized to view Protected Health Information (PHI). In many applications, realistic generated synthetic data can substitute for PHI and real patient data. While methods exist for generating synthetic data, it is unclear how to evaluate the quality of the result. This paper presents an investigation of a new method for statistically testing the quality of synthetic patient data. The Weighted Itemsets Error (WIE) measure compares frequent itemsets in the synthetic data with expected itemsets in real data, thus allowing the co-occurrence of data items to be evaluated. The derived measure is tested in the context of synthetic data comprising medical diagnoses. The results demonstrate the effects of the parameters that control the WIE measure and indicate that WIE is a simple yet powerful approach for evaluating synthetic datasets.
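
The abstract does not give the WIE formula, so the following is only an illustrative sketch of the general idea, not the authors' definition. It assumes each patient record is a list of diagnosis labels, mines itemsets that are frequent in the synthetic data, and averages the absolute difference between synthetic and real support, weighted by itemset size. The function names, the `min_support` and `max_size` parameters, and the size-based weighting are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def itemset_supports(records, max_size=2):
    """Fraction of records that contain each itemset of up to max_size items."""
    n = len(records)
    if n == 0:
        return {}
    counts = Counter()
    for record in records:
        items = sorted(set(record))  # de-duplicate and fix the item order
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return {itemset: c / n for itemset, c in counts.items()}


def weighted_itemset_error(real, synthetic, min_support=0.25, max_size=2):
    """Hypothetical WIE-style score: 0.0 means the supports match exactly.

    Assumption: itemsets are weighted by their size, so disagreement on
    co-occurring items counts more than disagreement on single items.
    """
    real_sup = itemset_supports(real, max_size)
    synth_sup = itemset_supports(synthetic, max_size)
    # Itemsets that are frequent in the synthetic data
    frequent = [s for s, v in synth_sup.items() if v >= min_support]
    if not frequent:
        return 0.0
    total_weight = sum(len(s) for s in frequent)
    error = sum(
        len(s) * abs(synth_sup[s] - real_sup.get(s, 0.0)) for s in frequent
    )
    return error / total_weight


real = [
    ["diabetes", "hypertension"],
    ["diabetes"],
    ["asthma"],
    ["diabetes", "hypertension"],
]
synthetic = [["diabetes"], ["diabetes"], ["asthma"], ["hypertension"]]

score_same = weighted_itemset_error(real, real)
score_diff = weighted_itemset_error(real, synthetic)
```

Because this sketch mines itemsets only from the synthetic data, it does not penalize co-occurrences that are common in the real data but absent from the synthetic data; a symmetric variant could take the union of itemsets frequent in either dataset.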
