Abstract

Unmanned aerial vehicles are an essential component in the realization of Industry 4.0. With drones helping to improve industrial safety and efficiency in utilities, construction, and communication, there is an urgent need for drone-based intelligent applications. In this paper, we develop a unified framework that simultaneously detects and counts vehicles in drone images. We first analyze why state-of-the-art detectors fail in highly dense drone scenes, and use these insights to guide our design. We then propose an effective loss, tailored to scale-adaptive anchor generation, that pushes anchors to match the ground-truth boxes as closely as possible. Inspired by attention in the human visual system, we maximize the mutual information between object classes and features by combining bottom-up cues with a top-down attention mechanism designed for feature extraction. Finally, we build a counting layer with a regularization constraint tied to the number of vehicles. Extensive experiments demonstrate the effectiveness of our approach: on both tasks, the proposed method achieves state-of-the-art results on all four challenging datasets and reduces error by a larger margin than previous methods.
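
The sketch below is not the authors' released code; it is a minimal illustration, under assumptions, of how a count-regularization term could be attached to a standard detection loss as the abstract describes. The L1 form of the count penalty, the weighting factor `lambda_count`, and the function name `detection_with_count_loss` are all hypothetical choices for illustration only.

```python
# Minimal sketch (assumed form, not the paper's implementation): a detection
# loss (classification + box regression) plus a count-regularization term
# that penalizes deviation of the predicted vehicle count from the annotation.
import torch
import torch.nn.functional as F


def detection_with_count_loss(cls_logits, cls_targets, box_preds, box_targets,
                              pred_count, gt_count, lambda_count=0.1):
    """Combined loss.

    cls_logits  : (N, C) per-anchor class scores
    cls_targets : (N,)   per-anchor class labels (0 = background)
    box_preds   : (M, 4) regressed boxes for positive anchors
    box_targets : (M, 4) matched ground-truth boxes
    pred_count  : scalar tensor, count from the counting layer
    gt_count    : scalar tensor, annotated number of vehicles
    """
    cls_loss = F.cross_entropy(cls_logits, cls_targets)
    box_loss = F.smooth_l1_loss(box_preds, box_targets)
    # Hypothetical count regularizer: L1 distance between predicted and
    # ground-truth counts, weighted by lambda_count.
    count_loss = F.l1_loss(pred_count, gt_count)
    return cls_loss + box_loss + lambda_count * count_loss


if __name__ == "__main__":
    # Toy tensors, just to show the loss runs end to end.
    torch.manual_seed(0)
    loss = detection_with_count_loss(
        cls_logits=torch.randn(8, 3),
        cls_targets=torch.randint(0, 3, (8,)),
        box_preds=torch.randn(4, 4),
        box_targets=torch.randn(4, 4),
        pred_count=torch.tensor(37.5),
        gt_count=torch.tensor(40.0),
    )
    print(loss)
```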
