Knowledge distillation has been widely applied in semantic segmentation to reduce model size and computational complexity. Prior knowledge distillation methods for semantic segmentation mainly focus on transferring spatial relation knowledge and neglect the channel correlation knowledge in the feature space, which is vital for semantic segmentation. To address this issue, we propose a novel Channel Correlation Distillation (CCD) method for semantic segmentation. The correlation between two channels indicates how likely those channels are to belong to the same category. We force the student to mimic the teacher by minimizing the distance between the channel correlation maps of the student and the teacher. Furthermore, we propose multi-scale discriminators to distinguish the differences between the teacher and student segmentation outputs at multiple scales. Extensive experiments on three popular datasets (Cityscapes, CamVid, and Pascal VOC 2012) validate the superiority of CCD. The experimental results show that CCD consistently improves state-of-the-art semantic segmentation methods across various network structures.
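The core idea of matching channel correlation maps can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions that the abstract does not state: correlation is taken as the cosine similarity between flattened channels, the distance is a mean squared difference, and the student and teacher feature maps have the same number of channels. The function names `channel_correlation` and `ccd_loss` are hypothetical.

```python
import numpy as np


def channel_correlation(features, eps=1e-8):
    """Map a (C, H, W) feature tensor to a (C, C) channel correlation map.

    Assumption: correlation is the cosine similarity between flattened channels.
    """
    num_channels = features.shape[0]
    flat = features.reshape(num_channels, -1)            # (C, H*W)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)  # per-channel L2 norm
    flat = flat / (norms + eps)                          # normalize each channel
    return flat @ flat.T                                 # pairwise similarities


def ccd_loss(student, teacher):
    """Mean squared distance between the two correlation maps (assumed metric)."""
    diff = channel_correlation(student) - channel_correlation(teacher)
    return float(np.mean(diff ** 2))


# Toy feature maps with matching channel counts (an assumption of this sketch).
rng = np.random.default_rng(0)
teacher_feat = rng.standard_normal((8, 16, 16))
student_feat = rng.standard_normal((8, 16, 16))

loss_random = ccd_loss(student_feat, teacher_feat)  # mismatched features
loss_same = ccd_loss(teacher_feat, teacher_feat)    # identical features
```

In this sketch the loss vanishes when the student reproduces the teacher's channel correlations and is positive otherwise. Because the correlation map is C x C regardless of spatial size, the comparison does not require the two feature maps to share a resolution, only a channel count.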