Abstract
In this paper, we propose a new lightweight Channel-Spatial Knowledge Distillation (CSKD) method for efficient image semantic segmentation. More precisely, we investigate the KD approach that trains a compressed neural network, called the student, under the supervision of a heavy one, called the teacher. In this context, we improve the distillation mechanism by capturing contextual dependencies in both the spatial and channel dimensions through a self-attention principle. In addition, to quantify the difference between teacher and student knowledge, we adopt the Centered Kernel Alignment (CKA) metric, which spares the student from adding extra learnable layers to match the size of the teacher's features. Experimental results on the Cityscapes, CamVid and Pascal VOC datasets demonstrate that our method achieves outstanding performance. The code is available at https://github.com/ayoubkarine/CSKD.
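To illustrate why CKA needs no feature-matching layers, the following is a minimal sketch of the standard linear CKA similarity (the exact CKA variant used in CSKD is an assumption here): because it compares the batch-by-batch Gram structure of the features, it accepts student and teacher feature maps of different channel widths directly.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between feature matrices X (n, d1) and Y (n, d2).

    d1 and d2 may differ, so no projection layer is needed to
    align student and teacher feature sizes. Returns a value in [0, 1].
    """
    # Center each feature dimension over the batch.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)
```

For distillation, one could flatten a student feature map to shape (batch, C_s) and a teacher map to (batch, C_t) and maximize their CKA (e.g., use 1 - CKA as the distillation loss), with no learnable adapter between the two.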