Data-Free Knowledge Distillation with Soft Targeted Transfer Set Synthesis

Zi Wang

doi:10.1609/aaai.v35i11.17228

Abstract

Knowledge distillation (KD) has proved to be an effective approach for deep neural network compression: a compact network (the student) is trained by transferring knowledge from a pre-trained, over-parameterized network (the teacher). In traditional KD, the transferred knowledge is usually obtained by feeding training samples to the teacher network and recording the class probabilities it outputs. However, the original training dataset is not always available, owing to storage costs or privacy issues. In this study, we propose a novel data-free KD approach that models the intermediate feature space of the teacher with a multivariate normal distribution and uses the soft targeted labels generated from that distribution to synthesize pseudo samples as the transfer set. Several student networks trained with these synthesized transfer sets perform competitively with networks trained on the original training set and with other data-free KD approaches.
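The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state which layer is modeled, how the distribution's parameters are chosen, or which loss drives the synthesis. The toy linear teacher (`W_feat`, `W_cls`), the zero mean and identity covariance, the temperature, and the cross-entropy objective are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy teacher: 8-d input -> 4-d feature -> 3 class logits.
# A real teacher would be a deep network; linear maps keep the sketch short.
W_feat = rng.normal(size=(8, 4)) / np.sqrt(8)
W_cls = rng.normal(size=(4, 3)) / np.sqrt(4)
TEMPERATURE = 2.0  # assumed softening temperature


def softmax(logits, temperature=TEMPERATURE):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def teacher_probs(x):
    """Softened class probabilities of the toy teacher for inputs x."""
    return softmax(x @ W_feat @ W_cls)


def sample_soft_targets(mean, cov, n):
    """Draw features from a multivariate normal and map them to soft labels."""
    feats = rng.multivariate_normal(mean, cov, size=n)
    return softmax(feats @ W_cls)


def cross_entropy(targets, probs):
    return float(-(targets * np.log(probs + 1e-12)).sum(axis=1).mean())


def synthesize(targets, steps=2000, lr=0.1):
    """Optimize random inputs so the teacher's output matches the soft targets."""
    x = rng.normal(size=(targets.shape[0], W_feat.shape[0]))
    W = W_feat @ W_cls
    losses = []
    for _ in range(steps):
        p = teacher_probs(x)
        losses.append(cross_entropy(targets, p))
        # Gradient of the per-sample cross-entropy with respect to the input.
        x -= lr * ((p - targets) / TEMPERATURE) @ W.T
    losses.append(cross_entropy(targets, teacher_probs(x)))
    return x, losses


# Assumed distribution parameters: zero mean, identity covariance.
targets = sample_soft_targets(np.zeros(4), np.eye(4), n=16)
pseudo_inputs, losses = synthesize(targets)
```

The resulting `pseudo_inputs`, paired with `targets`, would play the role of the transfer set on which a student is trained. In practice the input optimization would run through a nonlinear teacher using automatic differentiation rather than the closed-form gradient used here.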

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: May 18, 2021
Citations: 14

Similar Papers

Selective combination of multiple neural networks for improving model prediction in nonlinear systems modelling through forward selection and backward elimination
Zainal Ahmad ... Jie Zhang
Neurocomputing | VOL. 72 | 29 Feb 2008

Bayesian selective combination of multiple neural networks for improving long-range predictions in nonlinear process modelling
Zainal Ahmad ... Jie Zhang
Neural Computing and Applications | VOL. 14 | 06 Nov 2004

Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation
Gaurav Kumar Nayak ... Anirban Chakraborty
01 Jan 2020

Data-free Knowledge Distillation via Adversarial
Yu Jin ... Chao Li
23 Apr 2021
