Abstract

Knowledge distillation (KD) trains a lightweight proxy network, termed the student, to mimic the outputs of a heavier network, termed the teacher, so that the student can run in real time on resource-limited devices. This paradigm requires aligning the softened logits of the teacher and the student. However, few works question whether softening the logits truly brings out the full potential of the teacher-student paradigm. In this paper, we present several analyses that examine this issue from scratch. We then devise several simple yet effective functions to replace the vanilla KD objective. The final function serves as an effective alternative to its original counterpart and works well with other techniques such as FitNets. To support this claim, we conduct several visual tasks on individual benchmarks, and the experimental results verify the potential of our proposed function in terms of performance gains. For example, when the teacher and student networks are ShuffleNetV2-1.0 and ShuffleNetV2-0.5, our proposed method achieves a 40.88% top-1 error rate on Tiny ImageNet.
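
For context, the vanilla KD objective that the abstract questions aligns temperature-softened logits of teacher and student via a KL divergence. The sketch below is a minimal PyTorch illustration of that baseline only; the function name `vanilla_kd_loss` and the temperature value are illustrative assumptions, and the paper's proposed replacement functions are not reproduced here.

```python
import torch
import torch.nn.functional as F

def vanilla_kd_loss(student_logits, teacher_logits, temperature=4.0):
    """Baseline KD loss: KL divergence between temperature-softened
    teacher and student distributions. The T^2 factor keeps gradient
    magnitudes comparable across temperatures."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Toy usage: a batch of 8 samples over 200 classes (as in Tiny ImageNet).
student_logits = torch.randn(8, 200)
teacher_logits = torch.randn(8, 200)
loss = vanilla_kd_loss(student_logits, teacher_logits)
```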
