#5490 GENERATIVE ARTIFICIAL INTELLIGENCE FOR CREATION OF SYNTHETIC HYPERTENSION TRIAL DATA

Chirag Jain, Conor Judge

doi:10.1093/ndt/gfad063c_5490

Abstract

Background and Aims: Synthetic data can be an effective supplement or alternative to real data for training machine learning models. It may also be used to evaluate new tools, develop educational curricula, or remove undesirable biases from datasets. We aimed to evaluate four synthetic data generation methods applied to data from a hypertension randomized clinical trial.

Method: The Systolic Blood Pressure Intervention Trial (SPRINT) showed that intensive blood pressure (BP) control to a systolic BP (SBP) <120 mm Hg results in significant cardiovascular benefits in high-risk patients with hypertension compared with routine BP control to <140 mm Hg. The Synthetic Data Vault (SDV) is an ecosystem of synthetic data generation libraries that lets users generate new synthetic data with the same format and statistical properties as the original dataset. SDV supports multiple data types, including date-time, discrete-ordinal, categorical, and numerical. SPRINT data were pre-processed to create a single table of 140,000 patient visits with baseline variables (age, sex, race, aspirin use, estimated glomerular filtration rate (eGFR)) and visit-level variables (systolic and diastolic blood pressure, heart rate, and total number of antihypertensive medications at the end of the visit). Using the SDV library for Python, we created synthetic SPRINT data with four generative models: (1) Gaussian copula, (2) Conditional Tabular Generative Adversarial Network (CTGAN), (3) CopulaGAN, and (4) Tabular Variational Autoencoder (TVAE). We evaluated the results using the SDMetrics library, which assesses the shapes of the columns (marginal distributions), the pairwise trends between columns (correlations), whether the synthetic data reproduce mathematical properties of the original data, and new row synthesis. Finally, an overall quality score, which combines the marginal distribution and correlation scores, was computed on a scale from 0 (lowest quality) to 1 (highest quality); scores are reported below as percentages.

Results: Two hundred thousand synthetic patient visits were created with each method. Overall quality scores, in descending order, were 90.67% for Gaussian copula, 86.77% for TVAE, 81.03% for CTGAN, and 79.7% for CopulaGAN. The column shape score, which represents the marginal distributions, was highest for Gaussian copula (94.54%), followed by TVAE (88.44%), CTGAN (82.35%), and CopulaGAN (80.27%). The column pair trend score, which corresponds to correlations, was highest for Gaussian copula (86.8%), followed by TVAE (85.1%), CTGAN (79.72%), and CopulaGAN (79.12%).

Conclusion: Gaussian copula produced the highest-scoring synthetic SPRINT data on marginal distributions, correlations, and overall score. The Synthetic Data Vault offers a feasible collection of methods for generating synthetic clinical trial data for training future machine learning and AI models.
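The way the reported scores relate to one another can be illustrated with a minimal, standard-library-only sketch. It assumes, following the SDMetrics documentation as we understand it, that the column shape score is built from the KS complement (numerical columns) and the total variation complement (categorical columns), and that the overall quality score is the unweighted mean of the column shape and column pair trend scores. The function names below are hypothetical helpers written for this illustration; they are not the SDMetrics API and not the authors' code.

```python
from bisect import bisect_right
from collections import Counter
from statistics import mean


def ks_complement(real, synthetic):
    """1 minus the two-sample Kolmogorov-Smirnov statistic (numerical columns)."""
    real_sorted, synth_sorted = sorted(real), sorted(synthetic)
    max_gap = 0.0
    # Empirical CDFs are step functions that only change at observed values,
    # so the largest gap is found by checking those values.
    for x in set(real_sorted) | set(synth_sorted):
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, x) / len(synth_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_synth))
    return 1.0 - max_gap


def tv_complement(real, synthetic):
    """1 minus the total variation distance (categorical columns)."""
    freq_real, freq_synth = Counter(real), Counter(synthetic)
    categories = set(freq_real) | set(freq_synth)
    tvd = 0.5 * sum(
        abs(freq_real[c] / len(real) - freq_synth[c] / len(synthetic))
        for c in categories
    )
    return 1.0 - tvd


def overall_quality(column_shapes, column_pair_trends):
    """Unweighted mean of the two property scores (assumed aggregation)."""
    return mean([column_shapes, column_pair_trends])


# Scores reported in the abstract, in percent:
# (column shapes, column pair trends).
reported = {
    "Gaussian copula": (94.54, 86.80),
    "TVAE": (88.44, 85.10),
    "CTGAN": (82.35, 79.72),
    "CopulaGAN": (80.27, 79.12),
}

for model, (shapes, trends) in reported.items():
    print(f"{model}: overall = {overall_quality(shapes, trends):.3f}%")
```

Under this assumption, the averaged values agree with the overall scores reported in the Results to within rounding, which is consistent with the overall score being a simple mean of the two property scores.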


Published in: Nephrology Dialysis Transplantation


