Abstract
To overcome the shortcomings of traditional manual detection of underwater targets in side-scan sonar (SSS) images, this paper proposes a real-time automatic target recognition (ATR) method. The method consists of image preprocessing, sampling, ATR by integrating the transformer module with YOLOv5s (that is, TR–YOLOv5s), and target localization. Considering the target-sparse and feature-barren characteristics of SSS images, a novel TR–YOLOv5s network and a down-sampling principle are put forward, and an attention mechanism is introduced to meet the accuracy and efficiency requirements of underwater target recognition. Experiments verified that the proposed method achieved 85.6% mean average precision (mAP) and an 87.8% macro-F2 score, gains of 12.5% and 10.6%, respectively, over the YOLOv5s network trained from scratch, with a real-time recognition speed of about 0.068 s per image.
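The macro-F2 score reported above can be illustrated with a minimal sketch. It assumes that macro-F2 is the unweighted mean of the per-class F2 scores, where F-beta is (1 + beta^2) * P * R / (beta^2 * P + R) and beta = 2 weights recall more heavily than precision; the abstract does not state the exact averaging used. The function names and the precision/recall values below are hypothetical and are not results from the paper.

```python
def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when the class is never detected
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def macro_f_beta(per_class, beta=2.0):
    """Unweighted mean of per-class F-beta scores (assumed macro definition).

    per_class maps a class name to a (precision, recall) pair.
    """
    scores = [f_beta(p, r, beta) for p, r in per_class.values()]
    return sum(scores) / len(scores)


# Hypothetical per-class precision and recall, for illustration only.
example = {"shipwreck": (0.5, 1.0), "container": (1.0, 0.5)}
macro_f2 = macro_f_beta(example)
```

In this example the class with high recall and low precision scores higher than the class with the reverse, which is why F2 suits tasks where a missed target costs more than a false alarm.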
Highlights
The rapid development of the marine economy and the shipping business has placed higher demands on maritime safety.
Figure 1 shows the flow chart of the proposed real-time side-scan sonar (SSS) ATR method, which consists mainly of sonar image preprocessing, sampling, automatic target recognition by TR–YOLOv5s, and target localization.
To verify the proposed real-time SSS ATR method, an SSS image set of shipwreck and submarine container targets was collected, augmented, divided into a training set, a validation set and a test set, and used to build the detector.
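The division into training, validation and test sets can be sketched as follows. This is a minimal illustration, not the authors' procedure: the 70/20/10 split fractions, the fixed seed, the function name `split_dataset` and the file names are all assumptions, since the split ratios are not given here.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle items reproducibly and split them into train/val/test lists.

    The fractions are assumed values; whatever remains after the training
    and validation sets are taken becomes the test set.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is repeatable
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


# Hypothetical image file names, for illustration only.
images = [f"sss_{i:03d}.png" for i in range(10)]
train_set, val_set, test_set = split_dataset(images)
```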
Summary
The rapid development of the marine economy and the shipping business has placed higher demands on maritime safety. Underwater targets such as shipwrecks and submerged containers pose a threat to navigation safety and reduce the efficiency of water transportation. Side-scan sonar (SSS) can provide high-resolution images [1,2,3] and is widely used in underwater object detection [4] and maritime search and rescue (SAR) [5]. Object detection from SSS images has mainly relied on manual visual interpretation [6], so the detection result depends on the interpreter's skill and experience.