Abstract

In practical scenes, object detection faces complicated conditions. Occlusion occurs frequently in real scenes and degrades detection accuracy, especially for the occluded objects. For deep models, a larger dataset with sufficient occlusion samples improves detection performance; however, samples with occlusion are hard to obtain. Therefore, a global average pooling (GAP) based adversarial Faster-RCNN is proposed to generate hard samples and enhance the performance of the object detection algorithm. With this model, sufficient hard samples can be generated, so the detection model can be trained adequately for occluded objects. Hard samples are generated in the image feature space rather than by synthesizing images directly: the class-dependent part of a proposal's feature map is located by the GAP network and obscured to produce the hard-sample feature map used for reinforcement training. Consequently, a better object detection model can be trained from a conventional dataset. Faster-RCNN is adopted as the baseline, and Faster-RCNN and GAP are trained jointly to improve the performance of the proposed model. Simulation results validate the proposed algorithm.
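
As a rough illustration of the feature-space occlusion described above, the sketch below (PyTorch assumed; function and variable names such as mask_class_dependent_part and drop_ratio are hypothetical, not from the paper) locates the most class-dependent spatial locations of a proposal feature map using GAP/CAM-style class weights and zeroes them out to form a hard-sample feature map.

```python
import torch

def mask_class_dependent_part(feat: torch.Tensor,
                              class_weights: torch.Tensor,
                              drop_ratio: float = 0.3) -> torch.Tensor:
    """feat: (C, H, W) feature map of one proposal region.
    class_weights: (C,) GAP classifier weights for the detected class.
    Returns a copy of feat with the most class-dependent locations zeroed."""
    # Class activation map: channel-wise weighted sum, shape (H, W)
    cam = torch.einsum('c,chw->hw', class_weights, feat)
    # Occlude the top `drop_ratio` fraction of spatial locations
    k = max(1, int(drop_ratio * cam.numel()))
    thresh = torch.topk(cam.flatten(), k).values.min()
    keep = (cam < thresh).float()       # 0 where class evidence is strongest
    return feat * keep.unsqueeze(0)     # broadcast the mask over channels
```

In this reading, the resulting hard feature map would replace the original proposal feature before the detection head during reinforcement training.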

Highlights

  • Object detection is a hot topic in visual perception, providing information for image and video understanding [1], [2]

  • To intuitively exhibit the effect of fusing global average pooling (GAP) with Faster-RCNN, the weights ω_k^c corresponding to each detected category are gathered, and the weighted sum of these weights and the feature maps of the corresponding proposal region is computed

  • The first column shows the input image with bounding boxes, the second column shows the proposal region with its Class Activation Mapping (CAM) overlay, and the third column shows the proposal-region feature map of Faster-RCNN fused with GAP, where the highlighted part is the class-dependent part


Summary

INTRODUCTION

Object detection is a hot topic in visual perception, providing information for image and video understanding [1], [2]. Instead of relying on hard-sample image generation, this paper processes the features of the convolutional neural network to obtain hard samples and strengthen the training of the object detection model. To intuitively exhibit the effect of fusing GAP with Faster-RCNN, the weights ω_k^c corresponding to each detected category are gathered, and the weighted sum of these weights and the feature maps of the corresponding proposal region is computed. In Figure 4, the first column is the input image with bounding boxes, the second column is the proposal region with its CAM overlay, and the third column is the proposal-region feature map of Faster-RCNN fused with GAP, where the highlighted part is the class-dependent part.
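
The weighted sum described above corresponds to the standard CAM construction. A minimal sketch follows, assuming PyTorch and a GAP classifier whose fully connected weights play the role of ω_k^c; the function name class_activation_map and the upsampling size are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def class_activation_map(proposal_feat: torch.Tensor,
                         gap_fc_weight: torch.Tensor,
                         class_idx: int,
                         out_size=(224, 224)) -> torch.Tensor:
    """proposal_feat: (K, H, W) feature maps of one proposal region.
    gap_fc_weight: (num_classes, K) weights of the fully connected layer
    that follows global average pooling. Returns a normalized CAM."""
    w_c = gap_fc_weight[class_idx]                       # the ω_k^c for this class
    cam = torch.einsum('k,khw->hw', w_c, proposal_feat)  # weighted sum over channels
    cam = F.relu(cam)                                    # keep positive class evidence
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    # Upsample for overlaying on the proposal region, as in the CAM figures
    return F.interpolate(cam[None, None], size=out_size,
                         mode='bilinear', align_corners=False)[0, 0]
```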

