Abstract

Knowledge Distillation has been established as a highly promising approach for training compact and fast models by transferring knowledge from heavier and more powerful models, so as to satisfy the computation and storage requirements of deploying state-of-the-art deep neural models on embedded systems. However, conventional knowledge distillation requires multiple stages of training, rendering it a computationally and memory-demanding procedure. In this paper, a novel single-stage self-knowledge distillation method, namely Online Subclass Knowledge Distillation (OSKD), is proposed that aims at revealing the similarities inside classes, improving the performance of any deep neural model in an online manner. Hence, as opposed to existing online distillation methods, further knowledge is acquired from the model itself, without building multiple identical models or using multiple models to teach each other, making the OSKD approach more effective. The experimental evaluation on five datasets indicates that the proposed method enhances the classification performance, while comparison results against existing online distillation methods validate its superiority.
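For context, the multi-stage procedure referred to above is the conventional teacher-student distillation objective, in which a pretrained teacher's softened outputs supervise the student alongside the ground-truth labels. The sketch below is a minimal, generic PyTorch illustration of that standard distillation loss, not the OSKD objective; the temperature `T` and weight `alpha` are illustrative hyperparameters, not values from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Generic (offline) knowledge distillation loss: a weighted sum of the
    cross-entropy with the ground-truth labels and the KL divergence between
    temperature-softened teacher and student distributions.
    Illustrative only; OSKD replaces the separate pretrained teacher with
    knowledge extracted online from the model itself."""
    # Soften both distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # The KL term is scaled by T^2 to keep gradient magnitudes comparable
    # to the cross-entropy term.
    kd_term = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```

Training with this objective requires a fully trained teacher before the student can be optimized, which is the extra training stage and memory overhead that single-stage, online self-distillation methods such as OSKD aim to avoid.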
