Abstract

Unmanned aerial vehicles (UAVs) are widely used in post-disaster search and rescue, object tracking, and other tasks, so autonomous UAV perception based on computer vision has become a research hotspot in recent years. However, UAV images contain dense objects, small objects, and objects in arbitrary orientations, which pose significant challenges to existing object detection methods. To alleviate these issues, we propose a global-local feature enhanced network (GLF-Net). Considering the difficulty of processing UAV images with complex scenes and dense objects, we design a backbone based on involution and self-attention that extracts effective features from complex objects. A multiscale feature fusion module is also proposed to handle the numerous small objects in UAV images through multiscale object detection and feature fusion. To detect rotated objects accurately, a rotated region proposal network is designed based on the midpoint offset representation, which uses a rotated box to capture the true orientation and contour of an object. GLF-Net achieves state-of-the-art detection accuracy (86.52% mAP) on our RO-UAV dataset, and 96.95% and 97% mAP on the public HRSC2016 and UCAS-AOD datasets, respectively. The experimental results demonstrate that our method achieves high detection accuracy and strong generalization, meeting the practical requirements of UAVs in various complex scenarios.
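To illustrate the midpoint offset representation mentioned above, the sketch below decodes a box parameterized as (x, y, w, h, Δα, Δβ) into the four vertices of an oriented quadrilateral: (x, y, w, h) is the axis-aligned enclosing box, Δα shifts the midpoint of its top edge, and Δβ shifts the midpoint of its right edge, with the remaining two vertices following by point symmetry about the center. The function name and signature are illustrative, not taken from the paper.

```python
import numpy as np

def decode_midpoint_offsets(x, y, w, h, da, db):
    """Decode a midpoint-offset box (x, y, w, h, da, db) into four
    vertices of an oriented quadrilateral (illustrative sketch).

    (x, y) is the center and (w, h) the size of the axis-aligned
    enclosing box; da is the x-offset of the top edge's midpoint and
    db the y-offset of the right edge's midpoint. The other two
    vertices are obtained by point symmetry about the center.
    """
    v1 = (x + da, y - h / 2)   # midpoint of top edge, shifted by da
    v2 = (x + w / 2, y + db)   # midpoint of right edge, shifted by db
    v3 = (x - da, y + h / 2)   # symmetric to v1 about the center
    v4 = (x - w / 2, y - db)   # symmetric to v2 about the center
    return np.array([v1, v2, v3, v4])

# Example: a 4x2 enclosing box centered at the origin with small offsets
verts = decode_midpoint_offsets(0.0, 0.0, 4.0, 2.0, 0.5, -0.3)
```

With zero offsets the decoded quadrilateral degenerates to the axis-aligned box itself, which is why this parameterization can represent both horizontal and rotated proposals.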
