Abstract

This paper proposes MS-YOLOv7, a multi-scale object detection model based on YOLOv7, to address two issues common in Unmanned Aerial Vehicle (UAV) aerial images: a large number of objects and a high proportion of small objects. The network uses multiple detection heads and a CBAM convolutional attention module to extract features at different scales. To handle high-density object detection, the YOLOv7 architecture is combined with Swin Transformer units, and a new pyramidal pooling module, SPPFS, is incorporated into the network. Finally, SoftNMS and the Mish activation function are incorporated to improve the network's ability to identify overlapping and occluded objects. Experiments on the open-source VisDrone2019 dataset show that the new model significantly outperforms other state-of-the-art (SOTA) models. Compared with the baseline YOLOv7 object detection algorithm, MS-YOLOv7 increases mAP0.5 by 6.0% and mAP0.95 by 4.9%. Ablation experiments show that the designed modules improve detection accuracy, and the detection results are visualized in different scenarios. These experiments demonstrate the applicability of MS-YOLOv7 to object detection in UAV aerial photographs.
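
To make the two components named in the abstract concrete, here is a minimal sketch of the Mish activation and of Soft-NMS in plain Python. It is illustrative only and is not the authors' implementation. The Gaussian decay variant, the parameter names `sigma` and `score_threshold`, and their default values are assumptions, since the abstract does not say which Soft-NMS variant or settings were used.

```python
import math


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    # Numerically stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^-|x|)
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (assumed variant; hypothetical defaults).

    Instead of discarding boxes that overlap a higher-scoring box,
    their scores are decayed by exp(-IoU^2 / sigma).
    Returns a list of (original_index, final_score) in selection order.
    """
    candidates = [[i, s] for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the remaining box with the highest current score.
        best = max(range(len(candidates)), key=lambda k: candidates[k][1])
        idx, score = candidates.pop(best)
        if score < score_threshold:
            break  # every remaining candidate scores lower still
        kept.append((idx, score))
        # Decay the scores of the remaining boxes by their overlap.
        for cand in candidates:
            overlap = iou(boxes[idx], boxes[cand[0]])
            cand[1] *= math.exp(-(overlap ** 2) / sigma)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(boxes, scores):
        print(index, round(score, 3))
```

In this toy example the second box heavily overlaps the first, so its score is lowered rather than removed outright. This is the behaviour that helps retain genuinely distinct objects in crowded or occluded scenes, where hard NMS would suppress them.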
