Abstract

Modern machine and deep learning methods require large datasets to achieve reliable and robust results. This requirement is often difficult to meet in the medical field, due to data sharing limitations imposed by privacy regulations or the presence of a small number of patients (e.g., rare diseases). To address this data scarcity and to improve the situation, novel generative models such as Generative Adversarial Networks (GANs) have been widely used to generate synthetic data that mimic real data by representing features that reflect health-related information without reference to real patients. In this paper, we consider several GAN models to generate synthetic data used for training binary (malignant/benign) classifiers, and compare their performances in terms of classification accuracy with cases where only real data are considered. We aim to investigate how synthetic data can improve classification accuracy, especially when a small amount of data is available. To this end, we have developed and implemented an evaluation framework where binary classifiers are trained on extended datasets containing both real and synthetic data. The results show improved accuracy for classifiers trained with generated data from more advanced GAN models, even when limited amounts of original data are available.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.