Abstract

Aerial object detection is crucial in various computer vision tasks, including video monitoring, early warning systems, and visual tracking. While current methods can accurately detect normal-sized objects, they struggle to distinguish small objects from cluttered backgrounds. Developing methods that can be deployed on edge devices with fast, accurate, and energy-efficient performance also remains an urgent challenge. This paper proposes a network for aerial object detection that incorporates an attention mechanism to enhance feature extraction and improve the accuracy of aerial moving object detection. Additionally, we optimize the channel dimensions of the feature extraction framework, reducing model parameters, accelerating inference, and easing the computational burden. Furthermore, we optimize the Spatial Pyramid Pooling (SPP) module to enhance detection accuracy and processing speed. Inspired by the ResNet and RepVGG structures, we design a feature fusion module that combines early-extracted features, improving both speed and accuracy. Based on the design principles described above, we develop a neural network with an impressively small model size of only [Formula: see text] M. The proposed approach achieves state-of-the-art performance on five benchmark datasets. Beyond its superior accuracy, our method demonstrates excellent throughput on edge computing devices. Experimental results show that, even on a low-performance computing device, the CPU and GPU temperatures remain below 50 °C while the system achieves a detection speed of 14.8 frames per second (fps) with a power consumption of only 2.9 W. These findings suggest that a high-accuracy, low-power, low-latency, and low-memory-footprint aerial object detection solution is achievable.
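
The abstract does not specify the exact designs of the attention and SPP components, so the following is only an illustrative sketch under assumed choices: a PyTorch implementation using an SE-style channel attention block and an SPPF-style serial pooling block as generic stand-ins for the attention mechanism and the optimized SPP module described above. The class names, channel sizes, and pooling parameters are hypothetical, not taken from the paper.

```python
# Illustrative sketch only: stand-ins for the attention and optimized SPP
# components described in the abstract, assuming a PyTorch-style pipeline.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style channel attention (hypothetical stand-in for the paper's
    attention mechanism)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # global context per channel
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                             # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(self.pool(x))              # reweight feature channels


class SPPFast(nn.Module):
    """SPPF-style block: three serial max-pools reuse intermediate results,
    which is cheaper than the parallel pooling of the original SPP."""

    def __init__(self, channels: int, pool_size: int = 5):
        super().__init__()
        hidden = channels // 2
        self.reduce = nn.Conv2d(channels, hidden, 1)  # cut channels before pooling
        self.pool = nn.MaxPool2d(pool_size, stride=1, padding=pool_size // 2)
        self.expand = nn.Conv2d(hidden * 4, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.reduce(x)
        p1 = self.pool(x)
        p2 = self.pool(p1)
        p3 = self.pool(p2)
        return self.expand(torch.cat([x, p1, p2, p3], dim=1))


if __name__ == "__main__":
    feats = torch.randn(1, 256, 20, 20)               # dummy backbone feature map
    out = SPPFast(256)(ChannelAttention(256)(feats))
    print(out.shape)                                  # torch.Size([1, 256, 20, 20])
```

The serial-pooling design choice mirrors the abstract's goal of improving both accuracy and processing speed: cascading a single small max-pool approximates larger receptive fields at lower cost than pooling at several kernel sizes in parallel.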
