Abstract
In this paper, we propose a new lightweight Channel-Spatial Knowledge Distillation (CSKD) method for efficient image semantic segmentation. More precisely, we investigate the KD approach that trains a compressed neural network, called the student, under the supervision of a heavy one, called the teacher. In this context, we improve the distillation mechanism by capturing contextual dependencies in both the spatial and channel dimensions through a self-attention principle. In addition, to quantify the difference between teacher and student knowledge, we adopt the Centered Kernel Alignment (CKA) metric, which spares the student from adding extra learnable layers to match the size of the teacher's features. Experimental results on the Cityscapes, CamVid and Pascal VOC datasets demonstrate that our method achieves outstanding performance. The code is available at https://github.com/ayoubkarine/CSKD.
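To illustrate why CKA needs no feature-matching layers, the following is a minimal sketch of the standard linear CKA similarity (the exact CKA variant used in CSKD is an assumption here): because it compares the batch-by-batch Gram structure of the features, it accepts student and teacher feature maps of different channel widths directly.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between feature matrices X (n, d1) and Y (n, d2).

    d1 and d2 may differ, so no projection layer is needed to
    align student and teacher feature sizes. Returns a value in [0, 1].
    """
    # Center each feature dimension over the batch.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)
```

For distillation, one could flatten a student feature map to shape (batch, C_s) and a teacher map to (batch, C_t) and maximize their CKA (e.g., use 1 - CKA as the distillation loss), with no learnable adapter between the two.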