Abstract

Knowledge distillation (KD) is a promising approach to learning compact models for object detection with information inherited from intricate teacher networks. In this paper, we identify several shortcomings of existing KD methods for object detectors, such as the neglect of knowledge selection and the use of coarse feature imitation masks. To address these issues, we present a novel KD framework that trains efficient object detectors via Logit Mimicking and Feature Imitation (LMFI). First, a novel logit mimicking method is put forward to distill the classification and localization heads. On the one hand, it proposes, for the first time, to mimic the classification logits of one-category object detectors. On the other hand, the localization knowledge from teacher predictions and ground truths is exploited to dynamically guide the learning of the student's regression outputs in stages. Second, an adaptive positive teacher selection (APTS) strategy is designed to obtain high-quality teacher samples during distillation, which reduces the transmission of inferior knowledge. Moreover, a soft metric and a fine-grained mask are heuristically introduced to reconcile the discrepancy between teacher and student features in a position-wise manner. Extensive experiments show that LMFI outperforms state-of-the-art KD frameworks for object detection. It significantly boosts the performance of various detectors on different benchmarks, e.g., a 2.83% and 2.58% MR−2 reduction for Cascade R-CNN on the S and All sets of CityPersons, and an improvement of ResNet-50 based TOOD from 40.3% to 42.9% mAP on the COCO benchmark.
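The abstract describes feature imitation only at a high level; as a rough illustration of what a position-wise, mask-weighted feature imitation loss can look like, the sketch below weights the squared discrepancy between teacher and student feature maps by a fine-grained soft mask. The function name, tensor shapes, and the simple L2 formulation are assumptions made for illustration and are not the authors' actual LMFI loss.

```python
import torch


def masked_feature_imitation_loss(feat_s: torch.Tensor,
                                  feat_t: torch.Tensor,
                                  mask: torch.Tensor) -> torch.Tensor:
    """Position-wise weighted imitation loss between student and teacher features.

    feat_s, feat_t: (N, C, H, W) feature maps from the student and teacher.
    mask:           (N, 1, H, W) soft weights in [0, 1]; higher values mark
                    positions judged more informative (e.g. foreground regions).
    Note: this is an illustrative sketch, not the loss defined in the paper.
    """
    # Squared discrepancy at every spatial position and channel.
    diff = (feat_s - feat_t) ** 2
    # Weight the discrepancy by the fine-grained mask, then normalise by the
    # total mask weight (times channels) so the loss scale does not depend on
    # how sparse the mask is.
    denom = (mask.sum() * feat_s.size(1)).clamp(min=1.0)
    return (diff * mask).sum() / denom


# Example usage on one (hypothetical) feature-pyramid level.
feat_t = torch.randn(2, 256, 64, 64)   # teacher feature map
feat_s = torch.randn(2, 256, 64, 64)   # student feature map (same shape here)
mask = torch.rand(2, 1, 64, 64)        # soft, position-wise mask
print(masked_feature_imitation_loss(feat_s, feat_t, mask))
```

Normalising by the mask weight rather than the number of positions keeps the gradient magnitude comparable whether the mask emphasises a few locations or many, which is one common design choice for masked imitation losses.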
