Abstract

Knowledge distillation improves the performance of a compact student network by adding supervision from a pre-trained, cumbersome teacher network during training. To avoid the cost of acquiring a separate teacher network, self-knowledge distillation designs a multi-branch network architecture with layers shared between the teacher and student models, which are trained collaboratively in a single stage. However, this approach ignores the knowledge in shallow branches and rarely provides the diverse knowledge needed for effective collaboration among branches. To address these two shortcomings, this paper proposes a novel Diversified Branch Fusion approach for Self-Knowledge Distillation (DBFSKD). First, we design lightweight networks attached to the middle layers of the backbone, which capture discriminative information through global-local attention. We then introduce a diversity loss between branches to encourage diverse knowledge. This diverse knowledge is further integrated into two knowledge sources through Selective Feature Fusion (SFF) and Dynamic Logits Fusion (DLF). Thus, the significant knowledge in shallow branches is used efficiently, and all branches learn from each other through the fused knowledge sources. Extensive experiments with various backbone structures on four public datasets (CIFAR100, Tiny-ImageNet200, ImageNet, and RAF-DB) show that the proposed method outperforms competing methods. More importantly, DBFSKD achieves better performance with lower resource consumption than the baseline.
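The abstract describes the method's components without formulas, so the following PyTorch-style sketch is only one plausible instantiation of the pieces it names: a pairwise diversity loss between branch predictions, a sample-wise Dynamic Logits Fusion, and distillation of each branch from the fused logits. The function names, the cosine-similarity form of the diversity loss, and the true-class-confidence weighting in the fusion are assumptions for illustration, not the paper's actual formulations.

```python
# Hypothetical sketch of the branch-level losses suggested by the abstract.
import torch
import torch.nn.functional as F

def diversity_loss(branch_logits, tau=3.0):
    """Encourage branches to disagree: penalize pairwise similarity of their
    softened predictions (one possible form of the diversity loss)."""
    probs = [F.softmax(z / tau, dim=1) for z in branch_logits]
    n, loss = len(probs), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # cosine similarity between the two branches' prediction distributions
            loss = loss + F.cosine_similarity(probs[i], probs[j], dim=1).mean()
    return loss / (n * (n - 1) / 2)

def dynamic_logits_fusion(branch_logits, labels):
    """Fuse branch logits with per-sample weights; here the weights come from
    each branch's confidence on the true class (an assumption)."""
    conf = torch.stack(
        [F.softmax(z, dim=1).gather(1, labels[:, None]).squeeze(1)
         for z in branch_logits], dim=1)                  # (B, num_branches)
    w = F.softmax(conf, dim=1)                            # per-sample weights
    fused = sum(w[:, k:k + 1] * branch_logits[k] for k in range(len(branch_logits)))
    return fused.detach()  # the fused teacher signal is not back-propagated

def distill_to_branches(branch_logits, fused_logits, tau=3.0):
    """Each branch learns from the fused knowledge source via soft-label KL."""
    soft_t = F.softmax(fused_logits / tau, dim=1)
    return sum(F.kl_div(F.log_softmax(z / tau, dim=1), soft_t,
                        reduction="batchmean") * tau ** 2
               for z in branch_logits) / len(branch_logits)

# Example: three branches, batch of 8, 100 classes (CIFAR-100-like setting).
logits = [torch.randn(8, 100) for _ in range(3)]
labels = torch.randint(0, 100, (8,))
fused = dynamic_logits_fusion(logits, labels)
total = distill_to_branches(logits, fused) + 0.1 * diversity_loss(logits)
```

In this sketch the fused logits are detached so each branch is pulled toward the ensemble without the ensemble being degraded toward any single branch; whether DBFSKD makes the same choice is not stated in the abstract.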
