ABSTRACT Object detection on drone-view images is vital for applications such as intelligent transportation, abnormal behavior detection, and urban surveillance. However, unmanned aerial vehicles (UAVs) capture scenes from diverse perspectives and altitudes, so objects vary widely in scale and are often very small, which makes detection difficult. To address these challenges, we propose CSSDet, a detector that emphasizes multi-scale object detection and small-object enhancement. CSSDet integrates an additional detection head for small objects and employs a bidirectional weighted feature pyramid network for effective cross-scale feature fusion, enhancing global perceptual capability. Additionally, we introduce coordinate attention and Involution modules to mitigate information loss. Experiments are conducted on two publicly available datasets, VisDrone and UAVDT. Quantitative results indicate that CSSDet achieves the best results on a wide range of evaluation metrics, with particular strengths in multi-scale and small-object detection. Qualitative results show that it remains effective for small objects, for objects of different scales and categories, in multi-perspective scenarios, and in complex scenes. Together, these experiments indicate that CSSDet delivers well-rounded object detection performance on drone-view images. The source code is available at https://github.com/Gui-Cheng/CSSDet.
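To make the idea of weighted cross-scale fusion concrete, the following is a minimal sketch of one fusion node. It assumes the "fast normalized fusion" rule commonly associated with bidirectional weighted feature pyramids: each input gets a learnable scalar weight, the weight is clipped to be non-negative, and the weights are normalized by their sum. The abstract does not state which fusion rule CSSDet uses, so the function name `fast_normalized_fusion`, the `eps` value, and the flat-list feature representation are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,  # assumed small constant to avoid division by zero
) -> List[float]:
    """Fuse same-length feature vectors using non-negative, normalized weights.

    Hypothetical helper for illustration only; real feature maps would be
    multi-dimensional tensors already resized to a common resolution.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # ReLU-style clipping keeps every contribution non-negative.
    clipped = [max(0.0, w) for w in weights]
    total = sum(clipped) + eps

    # Weighted sum of the inputs, normalized by the total weight.
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


# Two toy "feature maps" from different pyramid levels (assumed already resized).
shallow = [1.0, 2.0, 3.0]
deep = [3.0, 2.0, 1.0]

fused_equal = fast_normalized_fusion([shallow, deep], weights=[1.0, 1.0])
fused_clipped = fast_normalized_fusion([shallow, deep], weights=[-1.0, 2.0])
```

With equal weights the output is close to the element-wise average of the two inputs. With a negative weight on `shallow`, clipping removes its contribution, so the output is close to `deep` alone. In a trained network the weights would be learned parameters rather than fixed constants.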