Automatic Ship Object Detection Model Based on YOLOv4 with Transformer Mechanism in Remote Sensing Images

Bowen Sun,Feng Dong,Amjad Pervez,Xiaofeng Wang,Ammar Oad

doi:10.3390/app13042488

Bowen Sun, Feng Dong + Show 3 more

Open Access

https://doi.org/10.3390/app13042488

Copy DOI

Abstract

Despite significant advancements in object detection technology, most existing detection networks fail to investigate global aspects while extracting features from the inputs and cannot automatically adjust based on the characteristics of the inputs. The present study addresses this problem by proposing a detection network consisting of three stages: preattention, attention, and prediction. In the preattention stage, the network framework is automatically selected based on the features of the images’ objects. In the attention stage, the transformer structure is introduced. Taking into account the global features of the target, this study combines a self-attention module in the transformer model and convolution operation to integrate image features from global to local and for detection, thus improving the ship target accuracy. This model uses mathematical methods to obtain results of predictive testing in the prediction stage. The above improvements are based on the You Only Look Once version 4 (YOLOv4) framework, named “Auto-T-YOLO”. The model achieves the highest accuracy of 96.3% on the SAR Ship Detection dataset (SSDD) compared to the other state-of-the-art (SOTA) model. It achieves 98.33% and 91.78% accuracy in the offshore and inshore scenes, respectively. The experimental results verify the practicality, validity, and robustness of the proposed model.

Full Text