PhylaGAN: data augmentation through conditional GANs and autoencoders for improving disease prediction accuracy using microbiome data.

Divya Sharma, Wendy Lou, Wei Xu

doi:10.1093/bioinformatics/btae161

Abstract

Research is improving our understanding of how the microbiome interacts with the human body and of its impact on human health. Existing machine learning methods have shown great potential in discriminating healthy from diseased microbiome states. However, machine-learning-based prediction using microbiome data faces challenges such as small sample sizes, imbalance between cases and controls, and the high cost of collecting large numbers of samples. To address these challenges, we propose a deep learning framework, phylaGAN, that augments existing datasets with generated microbiome data using a combination of a conditional generative adversarial network (C-GAN) and an autoencoder. The C-GAN trains two models against each other to generate larger simulated datasets that are representative of the original dataset. The autoencoder maps the original and the generated samples onto a common subspace to make prediction more accurate. Extensive evaluation and predictive analysis were conducted on two datasets, a type 2 diabetes (T2D) study and a cirrhosis study, showing improvements in mean AUC with data augmentation of 11% and 5%, respectively. External validation on a cohort classifying obese versus lean subjects, with a smaller sample size, showed an improvement in mean AUC of close to 32% when the cohort was augmented through phylaGAN, compared with using the original cohort alone. Our findings not only indicate that generative adversarial networks can create samples that mimic the original data across various diversity metrics, but also highlight the potential of enhancing disease prediction through machine learning models trained on synthetic data. The code is available at https://github.com/divya031090/phylaGAN.
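
The adversarial setup described in the abstract can be illustrated with a minimal sketch. This is not the phylaGAN implementation; it only shows, under stated assumptions, how a conditional GAN typically conditions the generator on a class label and how the two competing losses are commonly defined. The function names, the one-hot label encoding (0 = control, 1 = case), the noise dimension, and the use of the non-saturating generator loss are all assumptions made for illustration and do not come from the paper.

```python
import math
import random


def one_hot(label, n_classes):
    """Encode an integer class label as a one-hot list (assumed encoding)."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]


def generator_input(label, n_classes, noise_dim, rng):
    """Build the conditioned generator input: Gaussian noise followed by the label.

    Appending the label is what makes the GAN "conditional": the generator
    is asked for a sample of a specific class (e.g. case or control).
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(label, n_classes)


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability, clamped for stability."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """The discriminator wants real samples scored 1 and generated samples scored 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants its samples scored 1."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    rng = random.Random(0)
    z = generator_input(label=1, n_classes=2, noise_dim=8, rng=rng)
    print(len(z), z[-2:])  # 8 noise values followed by the one-hot label
    # Hypothetical discriminator scores for a batch of real and generated samples.
    print(discriminator_loss([0.9, 0.8], [0.2, 0.3]))
    print(generator_loss([0.2, 0.3]))
```

In a full training loop the two losses would be minimized in alternation, so that improving the discriminator pushes the generator toward samples that are harder to tell apart from the original data.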

Journal: Bioinformatics (Oxford, England)	Publication Date: Mar 29, 2024
Citations: 1	License type: CC BY 4.0

Similar Papers

Using Conditional Generative Adversarial Networks to Boost the Performance of Machine Learning in Microbiome Datasets
Derek Reiman ... Yang Dai
01 Jan 2020

Improving mixed-integer temporal modeling by generating synthetic data using conditional generative adversarial networks: A case study of fluid overload prediction in the intensive care unit
Alireza Rafiei ... Rishikesan Kamaleswaran
Computers in biology and medicine | VOL. 168
22 Nov 2023

GAN-based data augmentation to improve breast ultrasound and mammography mass classification
Yuliana Jiménez-Gaona ... María José Rodríguez-Álvarez
Biomedical Signal Processing and Control | VOL. 94
30 Mar 2024

Designing mm-wave electromagnetic engineered surfaces using generative adversarial networks
Sanaz Mohammadjafari ... Ozan Ozyegen
Neural Computing and Applications | VOL. 33
11 Jan 2021
