A comprehensive survey for generative data augmentation

Yunhao Chen, Zihui Yan, Yunjie Zhu

doi:10.1016/j.neucom.2024.128167

Abstract

Generative data augmentation (GDA) has emerged as a promising technique to alleviate data scarcity in machine learning applications. This thesis presents a comprehensive survey and unified framework of the GDA landscape. We first provide an overview of GDA, discussing its motivation, taxonomy, and key distinctions from synthetic data generation. We then systematically analyze the critical aspects of GDA: the selection of generative models, techniques to utilize them, data selection methodologies, validation approaches, and diverse applications. Our proposed unified framework categorizes the extensive GDA literature, revealing gaps such as the lack of universal benchmarks. The thesis summarizes promising research directions, including effective data selection, theoretical development for the application of large-scale models in GDA, and the establishment of a benchmark for GDA. By laying a structured foundation, this thesis aims to nurture more cohesive development and accelerate progress in the vital arena of generative data augmentation.

Journal: Neurocomputing
Publication Date: Jul 5, 2024
Citations: 1
