Abstract
To detect and recognize small targets submerged in complex backgrounds in infrared images, we combine a dynamic receptive field fusion strategy with a multi-scale feature fusion mechanism, which significantly improves small-target detection performance. First, a space-to-depth convolution (SPDConv) module is introduced as the downsampling layer in the backbone; it achieves the same downsampling effect while retaining more detailed information, which enhances the model’s detection capability for small targets. Then, the pyramid level 2 (P2) feature map, which has the smallest receptive field and the highest resolution, is added to the neck to reduce the loss of positional information during feature sampling. Furthermore, x-small detection heads are added, which strengthens the model’s understanding of the overall characteristics and structure of the target and improves the representation and localization of small targets. Finally, the cross-entropy loss function of the original network is replaced by an adaptive threshold focal loss function, forcing the model to allocate more attention to target features. These methods are built on the eighth version of the public You Only Look Once (YOLO) tool; the improved model is named SPT–YOLO (SPDConv + P2 + Adaptive Threshold + YOLOV8s) in this paper. Experiments on datasets including infrared small object detection (IR-SOD) and infrared small target detection 1K (IRSTD-1K) were conducted to verify the proposed algorithm, yielding a mean average precision of 94.0% at an IoU threshold of 0.5 and 69% averaged over IoU thresholds from 0.5 to 0.95. The results show that the proposed method achieves the best infrared small target detection performance among the compared existing methods.
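As an illustration of why a space-to-depth downsampling layer can retain detail, the following is a minimal NumPy sketch of the rearrangement step alone. It is not the paper's implementation: the function name `space_to_depth`, the `block` parameter, and the channel ordering are assumptions made for this example, and the convolution that typically follows the rearrangement in an SPDConv-style layer is omitted.

```python
import numpy as np


def space_to_depth(x, block=2):
    """Rearrange a (C, H, W) array into (C * block**2, H // block, W // block).

    Hypothetical helper for illustration: every input value is kept and
    moved into the channel dimension, so spatial resolution drops without
    discarding pixels.
    """
    c, h, w = x.shape
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    # Split each spatial axis into (coarse position, offset inside the block).
    x = x.reshape(c, h // block, block, w // block, block)
    # Move the two in-block offsets in front of the channel axis.
    x = x.transpose(2, 4, 0, 1, 3)
    return x.reshape(block * block * c, h // block, w // block)


# A single-channel 4x4 feature map becomes four 2x2 channels.
feature_map = np.arange(16, dtype=np.float32).reshape(1, 4, 4)
packed = space_to_depth(feature_map)
```

In this sketch the output holds exactly the same values as the input, only reorganized, whereas a stride-2 convolution or pooling layer reduces each 2x2 neighborhood to fewer numbers before later layers see it.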