Abstract

Aerial drone imagery contains many objects, a high proportion of which are small. To address the resulting detection challenges, we propose a dense small object detection algorithm for aerial imagery, Global Normalization Attention Mechanism You Only Look Once (GNYL), built on the Global Normalization Attention Mechanism (GNAM). GNAM is embedded in the backbone network of GNYL, where it extracts channel attention features and spatial attention features from the input features in a concatenated manner and uses the scale factors of batch normalization to suppress irrelevant channels or pixels. The spatial attention sub-module additionally introduces a three-dimensional arrangement with a multi-layer perceptron to reduce information loss and amplify the global interaction representation. The computed attention weights are then combined by weighting into the global normalized attention weights, which increases the utilization of effective information across the channel and spatial dimensions of the input features. We also optimize the backbone network, feature enhancement network, and detection heads to improve detection accuracy while keeping the detection network lightweight. In particular, we add a small object detection layer to improve localization accuracy for the abundant small objects in aerial imagery. We evaluate the algorithm on the publicly available VisDrone2019 dataset. Compared to the baseline network YOLOv8l, GNYL improves mAP0.5 by 7.2% and mAP0.95 by 5.0%. Compared to CDNet, GNYL improves mAP0.5 by 14.5% and mAP0.95 by 9.1%. These results demonstrate the practicality of the GNYL object detection network for detecting dense small objects in aerial imagery captured by unmanned aerial vehicles.
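The idea of using batch normalization scale factors as channel attention can be illustrated with a small sketch. This is not the GNAM implementation, whose details the abstract does not give; it assumes a common normalization-based formulation in which each channel's weight is its scale factor's magnitude divided by the sum over all channels, and the gate is a sigmoid of the weighted, normalized activation. The function names `channel_weights` and `channel_attention`, the flat-list layout of each channel, and the `eps` value are hypothetical choices made for illustration.

```python
import math


def channel_weights(gammas):
    """Turn BN scale factors into weights: |gamma_i| / sum_j |gamma_j| (assumed form)."""
    total = sum(abs(g) for g in gammas)
    if total == 0:
        raise ValueError("all scale factors are zero")
    return [abs(g) / total for g in gammas]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features, gammas, betas, eps=1e-5):
    """Gate each channel by sigmoid(w_c * BN(x)).

    features: list of channels, each a flat list of activations
    gammas, betas: per-channel BN scale and shift (hypothetical values here)
    """
    weights = channel_weights(gammas)
    out = []
    for x, g, b, w in zip(features, gammas, betas, weights):
        # Per-channel batch statistics over the activations of that channel.
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
        normed = [g * (v - mean) / math.sqrt(var + eps) + b for v in x]
        # Channels with a small scale factor get a near-constant gate,
        # so they contribute no content-dependent emphasis.
        gate = [sigmoid(w * n) for n in normed]
        out.append([v * a for v, a in zip(x, gate)])
    return out


features = [[1.0, 2.0, 3.0], [4.0, 4.0, 8.0]]
gated = channel_attention(features, gammas=[1.0, 0.0], betas=[0.0, 0.0])
```

In this toy input the second channel has a scale factor of zero, so its gate is uniform across positions, while the first channel's gate varies with the normalized activation. A spatial counterpart would apply the same weighting across pixel positions instead of channels.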
