Abstract
Multi-scale object detection is a central challenge in computer vision and image processing. Many deep learning detectors built to handle a variety of objects perform poorly on small objects, which lowers their overall detection accuracy. To cover the full range of scales, from extremely small to large objects, this work proposes the Spatially Dilated Multi-Scale Network (SDMNet), an architecture for UAV-based ground object detection. SDMNet introduces a Multi-scale Enhanced Effective Channel Attention mechanism to preserve object details in the images. It also incorporates dilated convolution, sub-pixel convolution, and additional prediction heads to improve detection performance specifically for aerial imagery. The model is evaluated on two popular aerial image datasets, VisDrone 2019 and UAVDT, which contain publicly available annotated images of ground objects captured from UAVs. The proposed architecture is benchmarked against existing object detection approaches using precision, recall, mAP, and detection rate. The experimental results demonstrate the effectiveness of the proposed model for multi-scale object detection, with average precision scores of 54.2% and 98.4% on the VisDrone and UAVDT datasets, respectively.
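The abstract names dilated convolution and sub-pixel convolution without describing them. The following is a minimal, generic sketch of the two operations in plain Python, not SDMNet's actual layers. The function names (`effective_kernel_size`, `dilated_conv1d`, `pixel_shuffle`) are hypothetical, and the channel ordering in `pixel_shuffle` assumes the common convention in which channel `i * r + j` fills sub-pixel offset `(i, j)`.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered along one axis by a kernel whose taps are `dilation` apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated cross-correlation.

    Dilation widens the receptive field without adding weights:
    tap i of the kernel reads signal[start + i * dilation].
    """
    span = effective_kernel_size(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def pixel_shuffle(channels, r):
    """Rearrange r*r low-resolution maps (each H x W) into one (H*r) x (W*r) map.

    This is the rearrangement step of sub-pixel convolution; the preceding
    learned convolution that produces the r*r channels is omitted here.
    """
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0] * (w * r) for _ in range(h * r)]
    for y in range(h * r):
        for x in range(w * r):
            c = (y % r) * r + (x % r)  # assumed channel ordering
            out[y][x] = channels[c][y // r][x // r]
    return out
```

A 3-tap kernel with dilation 2 spans 5 input samples, which illustrates why dilation can help capture context around small objects at no extra parameter cost, while the pixel shuffle trades channel depth for spatial resolution.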