Abstract Target detection from the aerial perspective of drones plays a crucial role in various fields. However, because drones capture images from a high-altitude overhead view, the images often contain a high proportion of small targets that appear at varying scales against complex backgrounds, which poses significant challenges for detection. To address these issues, we propose the EDR-YOLOv8 model for drone-based aerial target detection. First, the backbone of YOLOv8l is replaced with the high-resolution visual module EfficientViT, reducing the parameter count while maintaining the model's ability to express important features. Second, the feature fusion network is redesigned with a four-level prediction layer to improve the detection accuracy of small targets. In addition, the lightweight dynamic upsampler DySample is introduced to preserve more detailed target information. Finally, we design the feature fusion module C2f_RepGhost, which integrates the RepGhost bottleneck structure with YOLOv8's C2f, thereby reducing computational complexity. Experimental results on the VisDrone2019-DET dataset show that EDR-YOLOv8 achieves a 4.1% higher mAP@0.5 than the baseline YOLOv8l, with a 40.5% reduction in model size and a 42.0% reduction in parameter count. These results show that EDR-YOLOv8 is both more lightweight and more accurate than the baseline.