Abstract
Object detection by shipborne unmanned aerial vehicles (UAVs) equipped with electro-optical (EO) sensors plays an important role in maritime rescue and ocean monitoring. However, detecting small objects in maritime environments with both high precision and low latency remains a major challenge. To address this problem, this paper proposes the YOLO-BEV (“you only look once”, “bird’s-eye view”) model. First, we construct a bidirectional feature fusion module, PAN+ (Path Aggregation Network+), and add an extremely-small-object prediction head to handle the large variation in target scale at different heights. Second, we propose C2fSESA (Squeeze-and-Excitation Spatial Attention Based on C2f), an attention-based module that obtains richer feature information by aggregating features from layers of different depths. Finally, we describe a lightweight spatial pyramid pooling structure called RGSPP (Random and Group Convolution Spatial Pyramid Pooling), which uses group convolution and random channel rearrangement to reduce the model’s computational overhead and improve its generalization ability. We compare YOLO-BEV with other object-detection algorithms on the publicly available MOBDrone dataset. The results show that YOLO-BEV reaches an mAP0.5 of 97.1%, which is 4.3% higher than that of YOLOv5, and that the average precision for small objects increases by 22.2%. Additionally, YOLO-BEV maintains a detection speed of 48 frames per second (FPS). Consequently, the proposed method effectively balances accuracy and efficiency and outperforms other related techniques in shipborne UAV maritime object detection.
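To illustrate why group convolution lowers computational overhead, the following minimal sketch counts the weights of a standard and a grouped convolution and applies a random channel rearrangement. It is not the authors' RGSPP implementation: the function names, the channel counts (256), the kernel size (3), the group count (4), and the fixed seed are all assumptions chosen for illustration.

```python
import random


def conv_params(c_in, c_out, k, groups=1):
    """Number of weights in a k x k convolution (bias omitted).

    With `groups` groups, each output channel only sees c_in / groups
    input channels, so the weight count shrinks by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def rearrange_channels(channels, seed=0):
    """Return the channels in a random but reproducible order.

    Hypothetical stand-in for the random channel rearrangement named in
    the abstract: it lets information mix across groups between layers.
    """
    rng = random.Random(seed)
    order = list(range(len(channels)))
    rng.shuffle(order)
    return [channels[i] for i in order]


# Assumed layer sizes, for illustration only.
standard = conv_params(256, 256, 3)           # 256 * 256 * 9 weights
grouped = conv_params(256, 256, 3, groups=4)  # 64 * 256 * 9 weights
shuffled = rearrange_channels(list(range(8)))
```

Under these assumed sizes the grouped layer uses one quarter of the weights of the standard layer. The rearrangement changes only the order of the channels, not their content, so it adds no parameters.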