Dataset Condensation via Expert Subspace Projection.

Zhiheng Ma, Xing Wei, Dezheng Gao, Shaolei Yang, Yihong Gong

doi:10.3390/s23198148

Abstract

The rapid growth of dataset sizes in modern deep learning has significantly increased data storage costs. Moreover, the computation and time required to train deep neural networks generally grow in proportion to dataset size. Reducing dataset size while maintaining model performance is therefore a pressing research problem. Dataset condensation aims to distill the original dataset into a much smaller synthetic dataset that preserves downstream training performance on arbitrary neural network architectures. Previous work has shown that matching the training trajectory between the synthetic dataset and the original dataset is more effective than matching the instantaneous gradient, because it incorporates long-range information. Despite its effectiveness, trajectory matching requires complex gradient unrolling across iterations, which incurs significant memory and computation overhead. To address this issue, this paper proposes a novel approach called Expert Subspace Projection (ESP), which leverages long-range information while avoiding gradient unrolling. Instead of strictly forcing the synthetic dataset's training trajectory to mimic that of the real dataset, ESP only constrains it to lie within the subspace spanned by the training trajectory of the real dataset. The memory savings of our method allow unbiased training on the complete set of synthetic images and seamless integration with other dataset condensation techniques. Extensive experiments demonstrate the effectiveness of our approach. On CIFAR10 in the 1 Image/Class setting, our method outperforms the trajectory matching method by 16.7% and surpasses the previous state-of-the-art method by 3.2%.
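The abstract does not state the ESP objective, so the following is only a minimal sketch of the general idea of a subspace constraint, not the authors' implementation. It assumes model parameters are flattened into plain lists of floats, that the expert subspace is spanned by the differences between expert checkpoints and the first checkpoint, and that the penalty is the squared norm of the part of a synthetic-data update lying outside that subspace. All function names (`orthonormal_basis`, `project`, `subspace_residual`) are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-10):
    """Modified Gram-Schmidt: orthonormal basis for span(vectors)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip directions already in the span
            basis.append([wi / norm for wi in w])
    return basis


def project(v, basis):
    """Orthogonal projection of v onto the span of an orthonormal basis."""
    out = [0.0] * len(v)
    for b in basis:
        c = dot(v, b)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out


def subspace_residual(update, expert_checkpoints):
    """Squared norm of the component of `update` outside the expert subspace.

    Assumption (not from the paper): the subspace is spanned by the
    differences between each expert checkpoint and the first one.
    """
    start = expert_checkpoints[0]
    directions = [
        [p - s for p, s in zip(ckpt, start)] for ckpt in expert_checkpoints[1:]
    ]
    basis = orthonormal_basis(directions)
    proj = project(update, basis)
    return sum((u - p) ** 2 for u, p in zip(update, proj))


# Toy expert trajectory in a 3-parameter model; it only moves in the x-y plane.
expert = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
in_plane = subspace_residual([2.0, 3.0, 0.0], expert)
off_plane = subspace_residual([2.0, 3.0, 4.0], expert)
```

In this toy setup an update that stays in the plane traced by the expert incurs no penalty even though it does not follow the expert step by step, while the out-of-plane component is penalized. This mirrors the abstract's contrast between constraining a trajectory to a subspace and strictly matching it.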

Journal: Sensors (Basel, Switzerland)
Publication Date: Sep 28, 2023
License type: CC BY 4.0