Abstract

As a common model compression technique, knowledge distillation transfers knowledge from a complex large model with strong learning ability to a small student model with weak learning ability during training, improving the accuracy and performance of the small model. Many knowledge distillation methods have been designed specifically for object detection and have achieved good results. However, almost all of them fail to address two problems: negative transfer and the performance degradation caused by the high noise in current detection frameworks. In this study, we propose a feature automatic weight learning method based on Earth Mover's Distance (EMD) to solve these two problems. The EMD method processes the feature space vectors to reduce the impact of negative transfer and noise as much as possible, and at the same time weights are allocated adaptively so that the student learns less from poorly performing teachers and more from good ones. We redesigned the loss (EMD Loss) and improved the detection head to fit our approach. We carried out comprehensive performance tests on multiple datasets, including PASCAL, KITTI, ILSVRC, and MS-COCO, and obtained encouraging results. Our method can be applied not only to one-stage and two-stage detectors but also in combination with other methods.
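The abstract does not spell out the weighting mechanism, so the following is a minimal sketch of the general idea, assuming a multi-teacher setting, a closed-form 1-D EMD approximation over flattened features, and a softmax over negative distances. The function names, the MSE feature-matching term, and the temperature parameter are illustrative assumptions, not the paper's actual EMD Loss.

```python
import torch
import torch.nn.functional as F

def emd_1d(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Closed-form 1-D earth mover's distance between two equally sized
    empirical distributions: the mean absolute difference of their sorted
    samples. A simplification chosen for illustration only."""
    assert a.numel() == b.numel(), "features must have the same size"
    a_sorted, _ = torch.sort(a.flatten())
    b_sorted, _ = torch.sort(b.flatten())
    return (a_sorted - b_sorted).abs().mean()

def emd_weighted_distill_loss(student_feat, teacher_feats, temperature=1.0):
    """Hypothetical adaptive weighting: teachers whose features lie closer
    to the student's (smaller EMD) receive larger weights via a softmax
    over negative distances, damping negative transfer from noisy or
    poorly matched teachers."""
    dists = torch.stack([emd_1d(student_feat, t) for t in teacher_feats])
    weights = F.softmax(-dists / temperature, dim=0)
    losses = torch.stack([F.mse_loss(student_feat, t) for t in teacher_feats])
    return (weights * losses).sum()

# Usage with dummy detection-head features (batch, channels, H, W):
student = torch.randn(2, 256, 7, 7)
teachers = [torch.randn(2, 256, 7, 7) for _ in range(3)]
loss = emd_weighted_distill_loss(student, teachers)
```

The softmax direction encodes the abstract's stated goal: a teacher whose feature distribution diverges strongly from the student's contributes less to the distillation loss, so the student is "more inclined to learn from good teachers".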
