Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup

Guodong Xu, Ziwei Liu, Chen Change Loy

doi:10.1016/j.patcog.2023.109338

Abstract

Knowledge distillation (KD) has emerged as an essential technique not only for model compression but also for other learning tasks such as continual learning. Given the broader application spectrum and potential online usage of KD, distillation efficiency becomes a pivotal concern. In this work, we study this little-explored but important topic. Unlike previous works that focus solely on the accuracy of the student network, we pursue a harder goal: obtaining performance comparable to conventional KD at a lower computation cost during the transfer. To this end, we present UNcertainty-aware mIXup (UNIX), an effective approach that can reduce transfer cost by 20% to 30% while maintaining comparable, or even better, student performance than conventional KD. This is made possible by effective uncertainty sampling and a novel adaptive mixup approach, which together select informative samples dynamically from ample data and compact the knowledge in these samples. We show that our approach inherently performs hard sample mining. We demonstrate that our approach can improve various existing KD approaches by reducing their queries to a teacher network. Extensive experiments are performed on CIFAR100 and ImageNet. Code and model are available at https://github.com/xuguodong03/UNIXKD.
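To make the idea in the abstract concrete, the following is a minimal sketch of how uncertainty sampling and an uncertainty-weighted mixup could reduce the number of teacher queries per batch. It is not the authors' implementation (see the linked repository for that). The abstract does not give formulas, so everything specific here is an assumption chosen for illustration: the use of softmax entropy as the uncertainty measure, the hard-with-easy pairing rule, the mixing weight, and the names `normalized_entropy`, `uncertainty_mixup`, and `keep_ratio`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(logits):
    """Entropy of softmax(logits), scaled to [0, 1] by log(num_classes).

    Assumption: entropy stands in for the student's uncertainty.
    Requires at least two classes.
    """
    probs = softmax(logits)
    ent = -sum(p * math.log(p) for p in probs if p > 0.0)
    return ent / math.log(len(logits))


def uncertainty_mixup(inputs, student_logits, keep_ratio=0.75):
    """Return (picked_indices, mixed_inputs) for one batch.

    Only the mixed inputs would be sent to the teacher, so the teacher
    sees int(keep_ratio * n) inputs instead of n (hypothetical budget rule).
    """
    n = len(inputs)
    budget = max(1, int(keep_ratio * n))
    scores = [normalized_entropy(z) for z in student_logits]
    # Most uncertain samples first.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)

    mixed = []
    for rank in range(budget):
        hard = ranked[rank]          # an uncertain sample
        easy = ranked[n - 1 - rank]  # paired with a confident one
        # Assumed weighting: the more uncertain the hard sample,
        # the more it dominates the mixture (lam lies in [0.5, 1.0]).
        lam = min(1.0, 0.5 + 0.5 * scores[hard])
        mixed.append([lam * a + (1.0 - lam) * b
                      for a, b in zip(inputs[hard], inputs[easy])])
    return ranked[:budget], mixed
```

In this sketch the student still runs on every sample to obtain the uncertainty scores, while the teacher is queried only on the mixed inputs. With `keep_ratio=0.75`, a batch of four samples yields three teacher queries; this ratio is illustrative only and is not claimed to correspond to the cost figures reported in the abstract.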

Journal: Pattern Recognition
Publication Date: Jan 16, 2023
Citations: 10

Similar Papers

Continual Learning With Knowledge Distillation: A Survey.
Songze Li ... Zhongjie Wang
IEEE Transactions on Neural Networks and Learning Systems | VOL. PP
01 Jan 2024

Continual Contrastive Learning for Cross-Dataset Scene Classification
Rui Peng ... Caixia Rong
Remote Sensing | VOL. 14
12 Oct 2022

Generous teacher: Good at distilling knowledge for student learning
Yifeng Ding ... Wencheng Yang
Image and Vision Computing | VOL. 150
03 Aug 2024

CLASSIC: Continual and Contrastive Learning of Aspect Sentiment Classification Tasks
Zixuan Ke ... Hu Xu
01 Jan 2020
