Abstract

Security, transportation, and rescue applications depend on thorough interpretation of the visual data captured by drone platforms. While object detection research is expanding rapidly, detecting small objects from drone platforms remains a significant challenge: targets in drone-captured scenes are notoriously hard to detect due to low resolution, indistinguishable features, occlusion, and scale variation, among other factors. To address this, our work enriches object-oriented information to enhance the recognition of small objects in large-scale scenes, and is structured as follows. First, we devise a method for small object detection in UAV-captured scenes that incorporates enhanced object-oriented information, improving the fundamental convolutional feature transformation to yield more discriminative context for small objects. Second, we represent the input tokens of a vision Multilayer Perceptron (MLP) as wave functions; to obtain a more effective global representation, we compute the amplitude and phase of these tokens using a local-maximum approach, enabling dynamic aggregation tailored to each object's semantic information. Finally, we propose a local-to-global fusion method for comprehensive learning of object features. Experiments substantiate the efficacy of our model, which achieves a mean Average Precision (mAP) of 30.6% on the VisDrone dataset. At a consistent input size, this precision outperforms other state-of-the-art methods, underscoring our model's reliability for drone-captured small object detection.
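The abstract does not give the exact formulation of the wave-function token representation, but the idea of encoding each token by an amplitude and a phase, then aggregating the resulting complex values, can be sketched as follows. This is a minimal illustration assuming a WaveMLP-style representation (token z = amplitude · e^{iθ}); the function name `wave_aggregate` and the fixed aggregation weights are hypothetical, not the paper's method.

```python
import numpy as np

def wave_aggregate(amplitudes, phases, weights):
    """Aggregate tokens represented as waves.

    amplitudes: (n, d) array of per-token amplitudes
    phases:     (n, d) array of per-token phases (radians)
    weights:    (n,)   aggregation weights over tokens

    Each token j is lifted to a complex wave z_j = |z_j| * exp(i * theta_j);
    the weighted sum of waves is taken and projected back to the reals.
    Tokens with opposing phases cancel, so phase encodes how tokens
    interfere during aggregation rather than just their magnitude.
    """
    z = amplitudes * np.exp(1j * phases)       # complex wave per token
    agg = np.tensordot(weights, z, axes=1)     # weighted sum over the n tokens
    return agg.real
```

With all phases at zero this reduces to an ordinary weighted sum of token features, while a phase offset of π makes two equal-amplitude tokens cancel, which is the property that lets phase modulate aggregation per object.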
