Abstract

Deep learning methods have significantly improved object detection performance, but small object detection remains an extremely difficult and challenging task in computer vision. We propose a feature fusion and spatial attention-based single shot detector (FASSD) for small object detection. We fuse high-level semantic information into shallow layers to generate discriminative feature representations for small objects. To adaptively enhance the expression of small object areas and suppress the feature response of background regions, the spatial attention block learns a self-attention mask to enhance the original feature maps. We also establish a small object dataset (LAKE-BOAT) of a scene with a boat on a lake and tested our algorithm to evaluate its performance. The results show that our FASSD achieves 79.3% mAP (mean average precision) on the PASCAL VOC2007 test with input 300 × 300, which outperforms the original single shot multibox detector (SSD) by 1.6 points, as well as most improved algorithms based on SSD. The corresponding detection speed was 45.3 FPS (frame per second) on the VOC2007 test using a single NVIDIA TITAN RTX GPU. The test results of a simplified FASSD on the LAKE-BOAT dataset indicate that our model achieved an improvement of 3.5% mAP on the baseline network while maintaining a real-time detection speed (64.4 FPS).

Highlights

  • In standard datasets, large and medium objects usually occupy a larger proportion than small objects

  • The initialization parameters of the backbone were derived from a well pre-trained VGG16 on ImageNet, and Xavier initialization was applied to the remaining layers

  • The results indicate that our feature fusion block and spatial attention block perform well in enhancing the shallow layers

Read more

Summary

Introduction

Large and medium objects usually occupy a larger proportion than small objects. Small objects may carry crucial information, and small object detection has great application potential. It contributes to finding mild illness before a disease intensifies. In traffic management, it improves the monitoring accuracy of video monitoring systems for traffic flow, providing assistance for vehicle management. The preliminary detection of distant vehicles, signal lights, and signs is helpful in expanding the perception range of the visual system and preparing responses in advance. The importance of small object detection is clearly highlighted in the field of remote sensing image analysis, which is inherently associated with long-distance imaging

Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.