Abstract
In long-tailed recognition tasks, knowledge distillation is widely adopted to improve the performance of deep neural networks. These methods distill knowledge from a pretrained teacher model to a student model, which enables higher long-tailed recognition accuracy. However, the dependence on accompanying assistive models complicates the training of a single network and incurs large memory and time costs. In this work, we present Balanced Self-Distillation (BSD), which distills tail knowledge within a single network and without assistive models. Specifically, BSD distills knowledge between different distortions of the same samples to stimulate the representation learning potential of the single network, and adopts a balanced class weight to shift the distillation focus from head to tail classes. Comprehensive experiments on diverse datasets, including CIFAR-10-LT, CIFAR-100-LT, and TinyImageNet-LT, show that BSD consistently outperforms strong baseline methods. In particular, BSD achieves an improvement of 8.13% on CIFAR-100-LT with an imbalance ratio of 100 over the cross-entropy baseline. Furthermore, the proposed method integrates seamlessly with contemporary techniques such as re-sampling, meta-learning, and cost-sensitive learning, making it a versatile tool for effectively addressing the challenges of long-tailed scenarios.
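To make the core idea concrete, the sketch below illustrates one plausible form of a balanced self-distillation objective: a single network produces predictions for two distortions (augmentations) of the same batch, a consistency term is computed between them, and that term is re-weighted per class so tail classes receive more emphasis. This is a minimal illustration under stated assumptions, not the paper's exact formulation; the inverse-frequency weighting, the symmetric KL form, and the function name `balanced_self_distillation_loss` are all assumptions introduced here.

```python
import torch
import torch.nn.functional as F

def balanced_self_distillation_loss(logits_a, logits_b, targets, class_counts,
                                    temperature=2.0, alpha=1.0):
    """Hypothetical sketch of a balanced self-distillation loss.

    logits_a, logits_b: outputs of the same network for two distortions of
        the same batch, shape (B, C).
    targets: ground-truth labels, shape (B,).
    class_counts: number of training samples per class, shape (C,).
    The weighting scheme and symmetric KL are illustrative assumptions.
    """
    # Standard cross-entropy on one view as the base classification loss.
    ce = F.cross_entropy(logits_a, targets)

    # Per-class weights that grow for rare (tail) classes: a class with
    # average frequency gets weight ~1, tail classes get weights > 1.
    class_counts = class_counts.float()
    weights = class_counts.sum() / (class_counts * len(class_counts))
    sample_w = weights[targets]                                  # (B,)

    # Symmetric KL divergence between temperature-softened predictions
    # of the two distorted views of the same samples.
    log_p_a = F.log_softmax(logits_a / temperature, dim=1)
    log_p_b = F.log_softmax(logits_b / temperature, dim=1)
    kl_ab = F.kl_div(log_p_a, log_p_b.exp(), reduction='none').sum(dim=1)
    kl_ba = F.kl_div(log_p_b, log_p_a.exp(), reduction='none').sum(dim=1)
    distill = (sample_w * 0.5 * (kl_ab + kl_ba) * temperature ** 2).mean()

    return ce + alpha * distill
```

In use, the two logit tensors would come from forwarding two independently augmented copies of each image through the same network, so no teacher model, extra memory for a second network, or separate pretraining stage is required.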