Abstract

This paper proposes a practical end-to-end neural network framework for detecting tiny moving vehicles in satellite videos with low imaging quality. Instability factors such as illumination changes, motion blur, and low contrast against cluttered backgrounds make it difficult to distinguish true objects from noise and other point-shaped distractors. Moving vehicle detection in satellite videos can be carried out via background subtraction or frame differencing, but these methods tend to produce many false alarms and miss many positive targets. Appearance-based detection is a possible alternative, yet it is poorly suited here because classifier models have weak discriminative power for top-view vehicles at such low resolution. This paper addresses these issues by integrating motion information from adjacent frames to facilitate the extraction of semantic features, and by incorporating a Transformer to refine the features for key-point estimation and scale prediction. The proposed model effectively identifies actual moving targets while suppressing interference from stationary targets and background. Experiments and evaluations on satellite videos show that the proposed approach accurately locates targets with weak feature attributes and improves detection performance in complex scenarios.
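To make the baseline concrete: the frame-differencing approach the abstract contrasts against thresholds the absolute intensity change between adjacent frames to flag motion. The sketch below is a minimal illustration of that generic baseline in NumPy, not the paper's proposed network; the function name, threshold value, and synthetic frames are all assumptions for demonstration.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Binary motion mask from absolute frame differencing (illustrative baseline).

    Pixels whose intensity changed by more than `threshold` between the two
    frames are marked as moving. This is the simple baseline the abstract
    notes is prone to false alarms in low-quality satellite video.
    """
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Synthetic example: a small bright "vehicle" patch shifts one pixel right.
prev = np.zeros((8, 8), dtype=np.uint8)
curr = np.zeros((8, 8), dtype=np.uint8)
prev[3:5, 2:4] = 200
curr[3:5, 3:5] = 200
mask = frame_difference_mask(prev, curr)
```

Note that only the leading and trailing edges of the moving patch appear in the mask (the overlap region does not change), which hints at why raw differencing fragments small targets and motivates the learned motion features described above.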
