Abstract

The main technical challenges in current aerial-image object detection are the extremely low accuracy on small, densely distributed objects and the lack of semantic information. Moreover, existing detectors with large parameter counts are ill-suited to aerial-image object detection on low-end GPUs. To address these challenges, we propose efficient-lightweight You Only Look Once (EL-YOLO), a model designed to overcome the limitations of existing detectors while remaining suitable for low-end GPUs. EL-YOLO improves on the baseline models in three key areas. First, we design and evaluate three model architectures to strengthen the model's focus on small objects and identify the most effective network structure. Second, we design efficient spatial pyramid pooling (ESPP) to enhance the representation of small-object features in aerial images. Third, we introduce the alpha-complete intersection over union (α-CIoU) loss function to address the imbalance between positive and negative samples in aerial images. The proposed EL-YOLO method demonstrates strong generalization and robustness on the small-object detection problem in aerial images. The experimental results show that, with the model parameters kept below 10 M and the input image size unified at 640 × 640 pixels, EL-YOLOv5 reached an APS of 10.8% and 10.7% on two challenging aerial image datasets, DIOR and VisDrone, respectively, improving the APS by 1.9% and 2.2% over YOLOv5.
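To make the α-CIoU term concrete, the following is a minimal sketch for a single pair of axis-aligned boxes. It assumes the general α-IoU power formulation applied to CIoU (each term of the CIoU loss raised to the power α) and is not the authors' implementation; the function name `alpha_ciou_loss`, the `(x1, y1, x2, y2)` box layout, the default `alpha=3.0`, and the `eps` stabilizer are all illustrative assumptions.

```python
import math


def alpha_ciou_loss(box, gt, alpha=3.0, eps=1e-9):
    """Sketch of an alpha-CIoU loss for two boxes given as (x1, y1, x2, y2).

    Assumed form: 1 - IoU^alpha + (rho^2 / c^2)^alpha + (beta * v)^alpha.
    With alpha = 1 this reduces to the standard CIoU loss.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(x2, gx2) - min(x1, gx1)
    enc_h = max(y2, gy2) - min(y1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) - (gx1 + gx2)) ** 2 / 4 + ((y1 + y2) - (gy1 + gy2)) ** 2 / 4

    # Aspect-ratio consistency term and its trade-off weight, as in CIoU.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    beta = v / (1 - iou + v + eps)

    return 1 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha
```

For example, `alpha_ciou_loss((0, 0, 10, 10), (0, 0, 10, 10))` is approximately zero for a perfect match, while non-overlapping boxes give a loss above 1 because the IoU term vanishes and the center-distance penalty remains.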

