Abstract
Convolutional neural networks (CNNs) have significantly improved the accuracy of object detection. As networks become deeper, detection precision improves markedly, but more floating-point computation is also required. This heavy computational cost makes deep networks impractical for mobile and embedded vision applications. Many researchers apply knowledge distillation to improve the precision of object detection by transferring knowledge from a deeper, larger teacher network to a small student network. Most knowledge distillation methods require complex cost functions and mainly target two-stage object detection algorithms. We therefore propose a clean and effective knowledge distillation method, Generative Adversarial Networks - Knowledge Distillation (GAN-KD), for one-stage object detection. The feature maps generated by the teacher network and the student network are used as true and fake samples, respectively, and adversarial training is carried out on them to improve the performance of the student network in one-stage object detection. Experimental results show that our approach achieves a performance gain of 5% mAP over MobilenetV1 on the COCO dataset.
Highlights
In recent years, Convolutional Neural Networks (CNNs) have become ubiquitous in computer vision.
To distill knowledge effectively, we draw on the architecture of Generative Adversarial Networks (GAN) [14], taking the feature maps generated by the teacher network and the student network as true samples and fake samples, respectively.
At present, most knowledge distillation methods target two-stage object detection.
Summary
Convolutional Neural Networks (CNNs) have become ubiquitous in computer vision. This paper proposes a novel and effective knowledge distillation architecture that can clearly improve the performance of the student net in one-stage object detection. [12] distilled two-stage object detectors, extracting the middle feature map of the teacher net and the dark knowledge of R-CNN, respectively, to train the student net. To distill knowledge effectively, we draw on the architecture of the GAN [14], taking the feature maps generated by the teacher network and the student network as true samples and fake samples, respectively. We designed a neural network as a discriminator and used the true and fake samples for generative adversarial training to improve the performance of the student network in one-stage object detection. Here D denotes the discriminator network, and Teacher and Student denote the teacher net and the student net, respectively.
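The adversarial setup described above can be illustrated with a minimal sketch. The source does not state the exact loss, so this assumes the standard binary cross-entropy GAN objective with a non-saturating student loss. The function names (bce, discriminator_loss, student_adversarial_loss) are hypothetical, and the discriminator outputs D(Teacher(x)) and D(Student(x)) are passed in as plain probabilities rather than computed by a real network over feature maps.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one probability and a 0/1 label."""
    # Clip to avoid log(0); eps is an illustrative choice.
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def discriminator_loss(d_teacher, d_student):
    """Loss for D: teacher feature maps are 'true' (label 1),
    student feature maps are 'fake' (label 0).

    d_teacher: list of D(Teacher(x)) probabilities.
    d_student: list of D(Student(x)) probabilities.
    """
    real = sum(bce(p, 1) for p in d_teacher) / len(d_teacher)
    fake = sum(bce(p, 0) for p in d_student) / len(d_student)
    return real + fake


def student_adversarial_loss(d_student):
    """Loss for the student (generator role): push D to score the
    student's feature maps as 'true'. Assumed non-saturating form."""
    return sum(bce(p, 1) for p in d_student) / len(d_student)


if __name__ == "__main__":
    # An undecided discriminator outputs 0.5 for every sample.
    print(round(discriminator_loss([0.5, 0.5], [0.5, 0.5]), 4))  # 1.3863, i.e. 2 ln 2
    # The student's loss falls as D is fooled more often.
    print(student_adversarial_loss([0.9]) < student_adversarial_loss([0.1]))  # True
```

In a full training loop, this adversarial term would presumably be combined with the usual one-stage detection loss for the student; the weighting between the two is not given in the source.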