Abstract

With the increasing popularity of unmanned aerial vehicles (UAVs), accurate positioning and pose recognition of UAVs from target images acquired by photoelectric detection has become a research hotspot. To address this problem, this paper contributes a multi-scale UAV-Pose dataset consisting of 1400 UAV images and proposes a balanced and enhanced network (UAVPNet). UAVPNet has two major features: (1) a balanced feature pyramid (BFP) feature fusion structure to correct the imbalance among multi-scale features; (2) a VarifocalNet detection head to alleviate the foreground-background imbalance. A comparative study demonstrates that UAVPNet is superior to several state-of-the-art object detection models (such as Faster R-CNN-CARAFE and YOLOv8) in terms of detection accuracy and robustness. Specifically, UAVPNet achieves a state-of-the-art mAP of 0.885 on the newly created UAV-Pose dataset, with approximately 33.49 M parameters, 139.9 G FLOPs, and a speed of 9.8 FPS. These results indicate that it can meet the requirements of UAV positioning and pose recognition in complex environments.
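To illustrate how a VarifocalNet-style head can reduce the foreground-background imbalance, the following is a minimal per-sample sketch of the Varifocal Loss as defined in the original VarifocalNet work. It is not taken from the UAVPNet implementation: the function name `varifocal_loss` and the defaults `alpha=0.75`, `gamma=2.0` are assumptions borrowed from common VarifocalNet settings, and the abstract does not state which values UAVPNet uses.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-12):
    """Per-sample Varifocal Loss (illustrative sketch, hypothetical helper).

    p: predicted IoU-aware classification score in [0, 1].
    q: target score; the IoU with the ground-truth box for a foreground
       sample, and 0 for a background sample.
    alpha, gamma: assumed defaults, not confirmed for UAVPNet.
    """
    # Clamp p away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    if q > 0.0:
        # Foreground: binary cross-entropy weighted by the target q,
        # so high-IoU positives contribute more to the loss.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Background: down-weighted by alpha * p**gamma, so the many easy
    # negatives (small p) contribute very little.
    return -alpha * (p ** gamma) * math.log(1.0 - p)


if __name__ == "__main__":
    print("easy negative:", varifocal_loss(0.1, 0.0))
    print("hard negative:", varifocal_loss(0.9, 0.0))
    print("positive     :", varifocal_loss(0.5, 0.9))
```

The asymmetry is the point of the design: only background samples are scaled down by the `p ** gamma` factor, while foreground samples keep their full weight and are further scaled by their target quality `q`.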
