Generation of Synthetic Data with Conditional Generative Adversarial Networks

Belén Vega-Márquez, Isabel Nepomuceno-Chamorro, Cristina Rubio-Escudero

doi:10.1093/jigpal/jzaa059


Open Access

https://doi.org/10.1093/jigpal/jzaa059


Journal: Logic Journal of the IGPL
Publication Date: Nov 11, 2020
Citations: 4
License type: cc-by-nc-nd

Affiliation: Universidad de Sevilla

Abstract

The generation of synthetic data is becoming a fundamental task in the daily life of any organization due to the new data protection laws that are emerging. With the rise in the use of Artificial Intelligence, one of the most recent proposals to address this problem is the use of Generative Adversarial Networks (GANs). These networks have demonstrated a great capacity to create synthetic data with very good performance. The goal of synthetic data generation is to create data that will perform similarly to the original dataset for many analysis tasks, such as classification. The problem with GANs in a classification setting is that they do not take class labels into account when generating new data; the label is treated like any other attribute. This research work focuses on the creation of new synthetic data from datasets with different characteristics using a Conditional Generative Adversarial Network (CGAN). CGANs are an extension of GANs in which the class label is taken into account when the new data is generated. The results have been evaluated in two different ways: first, by comparing the results obtained with classification algorithms on both the original datasets and the generated data; second, by checking that the correlation between the original data and the generated data is minimal.
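The idea of taking the class label into account can be illustrated with a minimal sketch. The abstract does not describe the architecture used, so the code below assumes one common conditioning scheme: the one-hot class label is concatenated to the generator's noise vector and to each row seen by the discriminator. The function names (one_hot, generator_input, discriminator_input) and all dimensions are hypothetical and chosen only for illustration; no network is trained here.

```python
import random


def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list of floats."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(label, num_classes, noise_dim, rng):
    """Assumed conditioning scheme: noise vector z followed by the one-hot label."""
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return z + one_hot(label, num_classes)


def discriminator_input(row, label, num_classes):
    """The discriminator sees a data row (real or generated) plus the same label."""
    return list(row) + one_hot(label, num_classes)


# Hypothetical sizes: 3 classes, 4 noise dimensions, 2 data attributes.
rng = random.Random(0)
g_in = generator_input(label=2, num_classes=3, noise_dim=4, rng=rng)
d_in = discriminator_input([0.5, 1.2], label=2, num_classes=3)
```

Under this assumption the label is no longer just another attribute to be generated: it is an input that both networks receive, so rows can be requested for a specific class.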
