Abstract

As an effective model compression strategy, knowledge distillation allows a lightweight student model to acquire knowledge from a more expressive, large-scale teacher model. Unfortunately, although knowledge distillation for object detection based on feature imitation is typically designed to address the imbalance between positive and negative samples, recent dense object detectors already handle this imbalance well. Superposing the two therefore yields diminishing returns, meaning that the effect of such knowledge distillation in dense object detection is not remarkable. Recent research has shown that response-based knowledge distillation schemes can overcome this limitation by directly mimicking the predictions of the teacher model, but the scarcity of such attempts has limited further progress in overall performance. Inspired by an analogy with the parallel circuit, which enhances the effect of dual-stream structured networks, this work proposes a parallel knowledge distillation framework for dense object detection. Meanwhile, to further enable more reliable Localization Quality Estimation (LQE) for detection, a Soft Distribute-Guided Quality Predictor (SDGQP) is introduced for dynamic selection of distribution statistics. Additionally, with localization quality distillation, the gap between the classification and bounding-box regression branches can be bridged based on the more reliable localization quality scores of SDGQP. Experiments on different benchmark datasets show that the proposed method outperforms other state-of-the-art dense object detectors in both accuracy and robustness.
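As background for the response-based distillation the abstract refers to, the sketch below shows the standard temperature-softened Kullback-Leibler loss in which a student directly mimics the teacher's class predictions. This is a generic illustration, not the paper's parallel framework or SDGQP; all function names and the temperature value are illustrative assumptions.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis (illustrative helper)."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def response_kd_loss(student_logits, teacher_logits, T=2.0):
    """Response-based distillation loss: KL(teacher || student) on
    temperature-softened class scores, scaled by T^2 so gradient
    magnitudes stay comparable across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)
    return (T ** 2) * kl.mean()

# Example: a student whose logits match the teacher incurs (near) zero loss.
teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 1.0]])
student = np.array([[3.0, 1.5, 0.5], [0.5, 2.0, 1.5]])
print(response_kd_loss(student, teacher))   # small positive value
print(response_kd_loss(teacher, teacher))   # ~0.0
```

In dense detectors this per-location mimicry is what lets the student inherit the teacher's prediction quality without relying on feature-level imitation.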
