Abstract

Knowledge distillation based on features from the penultimate layer allows the student (a lightweight model) to efficiently mimic the internal feature outputs of the teacher (a high-capacity model). However, the training data may not conform to the ground-truth distribution of images in terms of either classes or features. We propose two knowledge distillation algorithms that address this problem by fitting the ground-truth class distribution and the ground-truth feature distribution, respectively. The former uses the teacher's predictions, rather than the dataset labels, to supervise the student's classification output, while the latter introduces a feature temperature parameter to correct abnormal feature distributions produced by the teacher. We conducted knowledge distillation experiments on the ImageNet-2012 and CIFAR-100 datasets using seven sets of homogeneous models and six sets of heterogeneous models. The experimental results show that our proposed algorithms improve the performance of penultimate-layer feature knowledge distillation and outperform existing knowledge distillation methods in classification performance and generalization ability.
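
The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of how the two ideas could be combined in a distillation loss: the teacher's softened predictions replace the dataset labels as the classification target, and a feature temperature rescales the penultimate-layer features before matching. All names here (`penultimate_kd_loss`, `logit_T`, `feat_T`) are hypothetical, and the paper's actual loss formulation and feature correction may differ.

```python
import torch
import torch.nn.functional as F

def penultimate_kd_loss(student_feat, teacher_feat,
                        student_logits, teacher_logits,
                        logit_T=4.0, feat_T=2.0):
    """Sketch of the two ideas described in the abstract (assumed form).

    (1) Teacher-label supervision: the student's classification output is
        matched to the teacher's temperature-softened prediction instead
        of the (possibly noisy) dataset label.
    (2) Feature temperature: penultimate-layer features are softened by a
        temperature before matching, damping abnormal teacher feature
        distributions.
    """
    # (1) KL divergence between softened class distributions, with the
    #     teacher's output serving as the supervisory signal. The T^2
    #     factor is the standard gradient-scale correction.
    cls_loss = F.kl_div(
        F.log_softmax(student_logits / logit_T, dim=1),
        F.softmax(teacher_logits / logit_T, dim=1),
        reduction="batchmean",
    ) * (logit_T ** 2)

    # (2) Temperature-corrected feature matching. Treating the feature
    #     vector as a distribution over channels is one plausible reading;
    #     the paper's exact correction may differ.
    feat_loss = F.kl_div(
        F.log_softmax(student_feat / feat_T, dim=1),
        F.softmax(teacher_feat / feat_T, dim=1),
        reduction="batchmean",
    ) * (feat_T ** 2)

    return cls_loss + feat_loss
```

In this sketch, raising `feat_T` flattens the teacher's feature distribution before matching, which is one way a temperature could suppress abnormally peaked feature outputs; the relative weighting of the two terms would be a tunable hyperparameter in practice.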
