Data-Distortion Guided Self-Distillation for Deep Neural Networks

Ting-Bing Xu, Cheng-Lin Liu

doi:10.1609/aaai.v33i01.33015565

Abstract

Knowledge distillation is an effective and widely used technique for transferring knowledge from one network to another. Although it improves network performance, its dependence on accompanying assistive models complicates the training of a single network and increases memory and time costs. In this paper, we design a more elegant self-distillation mechanism that transfers knowledge between different distorted versions of the same training data without relying on accompanying models. Specifically, the potential capacity of a single network is exploited by learning consistent global feature distributions and posterior distributions (class probabilities) across these distorted versions of the data. Extensive experiments on multiple datasets (i.e., CIFAR-10/100 and ImageNet) demonstrate that the proposed method effectively improves the generalization performance of various network architectures (such as AlexNet, ResNet, Wide ResNet, and DenseNet) and outperforms existing distillation methods with little extra training effort.
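
The consistency idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which divergences are used, so the choices here are assumptions made only for illustration: a symmetric KL divergence between the class posteriors of two distorted views, and the squared distance between batch-mean features (a linear-kernel MMD estimate) as the feature-distribution term. All function names and the weights `lambda_post` and `lambda_feat` are hypothetical, and the usual classification loss is omitted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def posterior_consistency(logits_a, logits_b):
    """Assumed posterior term: symmetric KL between the two views, batch-averaged."""
    total = 0.0
    for la, lb in zip(logits_a, logits_b):
        pa, pb = softmax(la), softmax(lb)
        total += kl_divergence(pa, pb) + kl_divergence(pb, pa)
    return total / len(logits_a)

def feature_consistency(feats_a, feats_b):
    """Assumed feature term: squared distance between batch-mean feature vectors
    (the biased linear-kernel MMD^2 estimate)."""
    dim = len(feats_a[0])
    mean_a = [sum(f[d] for f in feats_a) / len(feats_a) for d in range(dim)]
    mean_b = [sum(f[d] for f in feats_b) / len(feats_b) for d in range(dim)]
    return sum((x - y) ** 2 for x, y in zip(mean_a, mean_b))

def self_distillation_loss(logits_a, logits_b, feats_a, feats_b,
                           lambda_post=1.0, lambda_feat=1.0):
    """Weighted sum of the two consistency terms (weights are hypothetical)."""
    return (lambda_post * posterior_consistency(logits_a, logits_b)
            + lambda_feat * feature_consistency(feats_a, feats_b))

# Toy batch of two samples, each seen under two different distortions.
logits_view_a = [[2.0, 0.5, -1.0], [0.1, 1.2, 0.3]]
logits_view_b = [[1.5, 0.7, -0.8], [0.0, 1.0, 0.6]]
feats_view_a = [[0.2, 0.9], [0.4, 0.1]]
feats_view_b = [[0.3, 0.8], [0.5, 0.2]]

loss = self_distillation_loss(logits_view_a, logits_view_b,
                              feats_view_a, feats_view_b)
print(loss)
```

In this sketch both terms vanish when the two views produce identical outputs and grow as the views disagree, which is the behaviour a consistency objective of this kind is meant to have. In a real training loop the two views would come from the same network applied to two random distortions of each batch.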

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jul 17, 2019
Citations: 124
