Abstract

Most knowledge distillation methods for object detection are feature-based and have achieved competitive results. However, distilling only the feature-imitation part does not take full advantage of the more sophisticated detection head designs used in object detection, especially dense object detection. In this paper, a triple parallel distillation (TPD) is proposed that can efficiently transfer all of the output responses of the detection head from teacher to student. Moreover, to overcome the limited gains obtained by simply combining feature-based and response-based distillation, a hierarchical re-weighting attention distillation (HRAD) is proposed, which enables the student to learn more feature information than the teacher provides directly and establishes reciprocal feedback between the classification-IoU joint representation of the detection head and the attention-based features. By jointly exploiting the benefits of TPD and HRAD, a closed-loop unified knowledge distillation for dense object detection is proposed, which makes feature-based and response-based distillation unified and complementary. Experiments on different benchmark datasets show that the proposed method outperforms other state-of-the-art distillation methods for dense object detection in both accuracy and robustness.
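To make the two distillation families concrete, the following is a minimal sketch of how a unified loss might combine a response-based term (transferring head outputs such as the classification-IoU joint scores, in the spirit of TPD) with an attention-re-weighted feature-imitation term (in the spirit of HRAD). All function names, the choice of teacher-derived spatial attention, and the loss weights are assumptions for illustration; this is not the authors' released implementation.

```python
# Illustrative sketch only: combines response-based and attention-weighted
# feature-based distillation terms. Names and formulations are assumptions,
# not the paper's exact method.
import torch
import torch.nn.functional as F

def attention_weighted_feature_loss(f_student, f_teacher, temperature=0.5):
    """Feature imitation weighted by a teacher-derived spatial attention map.

    f_student, f_teacher: (N, C, H, W) feature maps from matching FPN levels.
    """
    # Spatial attention: channel-mean absolute activation, softmax over H*W.
    n, _, h, w = f_teacher.shape
    attn = f_teacher.abs().mean(dim=1).view(n, -1)            # (N, H*W)
    attn = F.softmax(attn / temperature, dim=1).view(n, 1, h, w)
    # Attention re-weights the per-location squared feature error.
    return (attn * (f_student - f_teacher).pow(2)).sum(dim=(2, 3)).mean()

def response_distillation_loss(logits_student, logits_teacher, tau=2.0):
    """KL divergence between head outputs (e.g., the classification-IoU
    joint scores of a dense detector), softened by temperature tau."""
    p_teacher = F.softmax(logits_teacher / tau, dim=-1)
    log_p_student = F.log_softmax(logits_student / tau, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau ** 2

def unified_kd_loss(feats_s, feats_t, logits_s, logits_t,
                    w_feat=1.0, w_resp=1.0):
    """Sum the feature term over FPN levels and add the response term."""
    feat_term = sum(attention_weighted_feature_loss(fs, ft)
                    for fs, ft in zip(feats_s, feats_t))
    resp_term = response_distillation_loss(logits_s, logits_t)
    return w_feat * feat_term + w_resp * resp_term
```

In such a setup, the total distillation loss would be added to the detector's standard training loss; the closed-loop character described in the abstract comes from the two terms informing each other (head responses shaping feature attention and vice versa), which the sketch above only approximates with fixed weights.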
