Abstract

In this paper, we propose a new lightweight Channel-Spatial Knowledge Distillation (CSKD) method for efficient image semantic segmentation. More precisely, we investigate the KD approach, which trains a compressed neural network, called the student, under the supervision of a heavy one, called the teacher. In this context, we improve the distillation mechanism by capturing the contextual dependencies along the spatial and channel dimensions through a self-attention principle. In addition, to quantify the difference between the teacher and student knowledge, we adopt the Centered Kernel Alignment (CKA) metric, which avoids adding extra learnable layers to the student to match the teacher's feature sizes. Experimental results on the Cityscapes, CamVid and Pascal VOC datasets demonstrate that our method achieves outstanding performance. The code is available at https://github.com/ayoubkarine/CSKD.
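
To illustrate why CKA removes the need for extra alignment layers, below is a minimal sketch of linear CKA between student and teacher feature maps. This is not the authors' implementation (see the repository above for that); the reshaping convention and the `1 - CKA` loss form are assumptions for illustration only.

```python
import torch

def linear_cka(X: torch.Tensor, Y: torch.Tensor) -> torch.Tensor:
    """Linear Centered Kernel Alignment between feature matrices.

    X: (n, p1) student features, Y: (n, p2) teacher features.
    p1 and p2 may differ, which is why no projection layer is needed.
    """
    # Center each representation along the sample dimension.
    X = X - X.mean(dim=0, keepdim=True)
    Y = Y - Y.mean(dim=0, keepdim=True)
    # ||Y^T X||_F^2 measures the cross-similarity of the two representations.
    cross = (Y.t() @ X).norm(p='fro') ** 2
    # Normalize by each representation's self-similarity.
    return cross / ((X.t() @ X).norm(p='fro') * (Y.t() @ Y).norm(p='fro'))

# Hypothetical usage: treat spatial positions as samples, channels as features.
student_feat = torch.randn(8, 128, 32, 32)   # (batch, C_s, H, W)
teacher_feat = torch.randn(8, 512, 32, 32)   # (batch, C_t, H, W)
X = student_feat.permute(0, 2, 3, 1).reshape(-1, 128)
Y = teacher_feat.permute(0, 2, 3, 1).reshape(-1, 512)
loss = 1.0 - linear_cka(X, Y)  # maximizing alignment = minimizing (1 - CKA)
```

Because CKA is invariant to the feature dimensionality of each input, the student's 128-channel features can be compared directly with the teacher's 512-channel features, with no learned projection in between.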
