A Kind of Water Surface Multi-Scale Object Detection Method Based on Improved YOLOv5 Network

Zhongli Ma, Lili Wu, Ruojin An, Yi Wan, Jiajia Liu

doi:10.3390/math11132936


Open Access



Journal: Mathematics
Publication Date: Jun 30, 2023
Citations: 3
License type: CC BY 4.0

Affiliation: Chengdu University of Information Technology

Abstract

Vision-based object detection systems are essential components of intelligent equipment for water surface environments. The diversity of water surface target types, the uneven distribution of target sizes, and the difficulty of dataset construction pose significant challenges for water surface object detection. This article proposes an improved YOLOv5 detection method for real water surface targets, which are diverse in type, large in number, and varied in scale. The improved model uses K-means++ clustering to generate the predefined bounding boxes, yielding a broader distribution of box sizes and thereby improving detection accuracy for multi-scale targets. We introduce the GAMAttention mechanism into the backbone network to reduce the significant performance gap between large and small targets caused by their difference in scale. The spatial pyramid pooling module in the backbone network is replaced to strengthen the model's ability to perceive and separate targets of different scales. Finally, Focal loss is adopted as the classification loss function to address the overfitting and poor accuracy caused by imbalanced class distribution in the training data. We conduct comparative tests on a self-constructed dataset comprising ten categories of water surface targets using four algorithms: Faster R-CNN, YOLOv4, YOLOv5, and the proposed improved YOLOv5. The experimental results show that the improved model achieves the best detection accuracy of the four, with an 8% improvement in mAP@0.5 over the original YOLOv5 in multi-scale water surface object detection.
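To illustrate why Focal loss helps with imbalanced classes, the following is a minimal sketch of the standard binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), set beside ordinary cross-entropy. It is not taken from the article: the function names are hypothetical, and the values alpha = 0.25 and gamma = 2.0 are the commonly cited defaults, assumed here because the abstract does not state the settings the authors used.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy for one prediction; p is the predicted probability of class 1."""
    p_t = p if y == 1 else 1.0 - p      # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)       # clamp to avoid log(0)
    return -math.log(p_t)


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal loss for one prediction (alpha and gamma are assumed defaults)."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # Compare an easy, an ambiguous, and a hard positive example.
    for p in (0.9, 0.5, 0.1):
        ce = binary_cross_entropy(p, 1)
        fl = binary_focal_loss(p, 1)
        print(f"p={p:.1f}  cross-entropy={ce:.4f}  focal={fl:.4f}")
```

The modulating factor (1 - p_t)^gamma suppresses the contribution of examples the model already classifies confidently, so training is driven more by hard or rare examples. With gamma = 0 the expression reduces to alpha-weighted cross-entropy.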
