Abstract

Research on knowledge distillation for deep neural networks has become increasingly active. Knowledge distillation involves training a low-capacity student model under the guidance of a high-capacity teacher model. However, when the capacities of the teacher and student models differ substantially, the result can be poor learning and low generalization performance. We propose a novel teacher assistant model called Knowledge in Attention Assistant. This model learns a discriminative representation of important regions and statistical information, along with spatial and channel knowledge. Moreover, by using a triplet attention mechanism, the student model can learn both the inner and outer distributions of the different categories and also memorize the knowledge distribution of the teacher model. This alignment improves the effectiveness and generalization of knowledge distillation and reduces the capacity gap between the teacher and student models. The proposed model addresses feature inconsistency by adjusting the attention weight distribution according to the resemblance between the teacher and student features. Evaluation of the proposed teacher assistant method shows remarkable results: the student model outperforms the teacher model in generalization performance, reaching 93.37% and 94.09% on the CIFAR-10 and CIFAR-100 datasets, respectively. Furthermore, the proposed model achieves F1-scores of 91.98% on CIFAR-10 and 79.69% on CIFAR-100.
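
To make the distillation objective concrete, the sketch below (in PyTorch) illustrates the general idea rather than the authors' exact method: soft-label distillation on the logits combined with matching of spatial attention maps, where the attention term is re-weighted by how far apart the teacher and student maps are. The function names, feature shapes, temperature, and weighting scheme are illustrative assumptions, not details taken from the paper.

    # Minimal sketch (not the authors' implementation) of attention-based
    # knowledge distillation with similarity-dependent weighting.
    import torch
    import torch.nn.functional as F

    def spatial_attention_map(feat: torch.Tensor) -> torch.Tensor:
        """Collapse a (B, C, H, W) feature map into an L2-normalized (B, H*W) spatial attention map."""
        attn = feat.pow(2).mean(dim=1)        # channel-wise energy -> (B, H, W)
        attn = attn.flatten(start_dim=1)      # (B, H*W)
        return F.normalize(attn, p=2, dim=1)

    def distillation_loss(student_logits, teacher_logits,
                          student_feats, teacher_feats,
                          temperature: float = 4.0, beta: float = 1000.0):
        """KL-based logit distillation plus attention-map matching.

        The attention term is re-weighted by the cosine distance between the
        teacher and student maps, loosely mirroring the idea of adjusting
        attention weights by teacher/student feature resemblance (an assumption here).
        """
        # Soft-label KL divergence between teacher and student predictions.
        kd = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            F.softmax(teacher_logits / temperature, dim=1),
            reduction="batchmean",
        ) * temperature ** 2

        # Attention transfer over each pair of intermediate feature maps.
        at = torch.tensor(0.0, device=student_logits.device)
        for fs, ft in zip(student_feats, teacher_feats):
            a_s, a_t = spatial_attention_map(fs), spatial_attention_map(ft)
            weight = 1.0 - F.cosine_similarity(a_s, a_t, dim=1).mean()  # larger gap -> larger weight
            at = at + weight * (a_s - a_t).pow(2).mean()

        return kd + beta * at

In a training loop, student_feats and teacher_feats would be lists of intermediate feature maps of matching spatial size, and the returned loss would be added to the usual cross-entropy term on the ground-truth labels.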
