Abstract

Self-Knowledge Distillation (Self-KD), a technique that enables a neural network to learn from itself, often relies on auxiliary modules or networks to generate supervisory signals for training. However, this approach incurs significant additional resource costs. Moreover, incorporating auxiliary classifiers within the network architecture creates a capacity mismatch when distilling knowledge from deep to shallow classifiers. This paper proposes a concise and efficient Self-KD method called Neighbor Self-Knowledge Distillation (NSKD). NSKD introduces teacher assistants into Self-KD by attaching auxiliary classifiers to the shallow parts of the network, constructing multiple neighboring student-teacher-assistant pairs for distillation and thereby reducing the capacity mismatch between students and teachers. During distillation, NSKD uses only the soft labels produced by each classifier and the corresponding ground-truth labels as supervisory signals, minimizing resource consumption. By letting neighboring modules learn from each other through neighboring distillation, NSKD improves overall network performance. Experimental results on five network models and seven popular datasets demonstrate the superiority of NSKD over other state-of-the-art Self-KD methods. Notably, NSKD achieves average accuracy improvements of 2.26%, 2.32%, and 2.4% on the CIFAR100, TinyImageNet, and fine-grained visual classification datasets, respectively.

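For concreteness, the neighboring-distillation objective described in the abstract could look roughly like the sketch below: every classifier is trained on the ground-truth labels, and each shallow classifier is additionally distilled from the soft labels of its immediate deeper neighbor. This is a minimal illustrative sketch, not the authors' implementation; the function name `neighbor_self_kd_loss`, the temperature `T`, and the weighting `alpha` are assumptions introduced here.

```python
import torch
import torch.nn.functional as F

def neighbor_self_kd_loss(logits_list, targets, T=3.0, alpha=0.5):
    # logits_list: classifier outputs ordered from shallow auxiliary heads
    # to the deepest (main) head; targets: ground-truth class indices.
    # Every classifier receives ground-truth supervision.
    ce = sum(F.cross_entropy(z, targets) for z in logits_list)

    kd = torch.zeros((), device=targets.device)
    for student, teacher in zip(logits_list[:-1], logits_list[1:]):
        # Each shallow classifier (student) is distilled from the soft labels
        # of its immediate deeper neighbor (teacher assistant). Detaching the
        # teacher is a simplification; the paper describes neighbors learning
        # from each other, which a symmetric second KL term would approximate.
        p_teacher = F.softmax(teacher.detach() / T, dim=1)
        log_p_student = F.log_softmax(student / T, dim=1)
        kd = kd + F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

    return (1 - alpha) * ce + alpha * kd
```

During training, the loss would be computed on the logits of the auxiliary classifiers together with those of the final head; at inference, only the main head is kept, so no extra cost is incurred at test time.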