Abstract

Crowd counting has been widely studied in computer vision in recent years, but it remains a challenging task because of large scale variations, perspective distortions, and complex backgrounds. Previous methods usually adopt dilated convolution to enlarge the receptive field, yet they can still generate poor density maps. To address this problem, we propose a novel architecture for crowd counting called the Attention-based Contextual Convolution Network (ACCNet). ACCNet contains two components: a Contextual Convolution Network, which first learns the relative influence of different scale-aware features, and an attention module, which further enlarges the range of scales while the network produces the final crowd density map. The key part of our network is the dense dilated module, which tightly connects earlier and later dilation layers to preserve information across changing scales. The attention mechanism also plays an important role in learning density maps from input images. To verify the feasibility of our method, we compare it against state-of-the-art methods on four crowd counting datasets (ShanghaiTech Part_A, ShanghaiTech Part_B, WorldEXPO'10, and UCF_CC_50); the results demonstrate that ACCNet is effective and robust in complex scenes.
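
To make the idea of densely connected dilation layers concrete, the following is a minimal one-dimensional sketch in plain Python, not the ACCNet implementation. The kernel size, the dilation rates (1, 2, 3), the shared weights, and the use of summation in place of channel concatenation are all illustrative assumptions; the abstract does not specify any of them. The function names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(signal, weights, dilation):
    """Zero-padded, same-length 1D dilated convolution (odd kernel size assumed)."""
    half = (len(weights) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i - half + j * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def dense_dilated_block(signal, weights, dilations):
    """Each layer receives the input plus the outputs of all earlier layers.

    Assumption: features are merged by summation, a simplified stand-in for
    the channel concatenation that dense connections typically use.
    """
    features = [signal]
    for d in dilations:
        merged = [sum(vals) for vals in zip(*features)]
        features.append(dilated_conv1d(merged, weights, d))
    return features[-1]


# Usage: feed a unit impulse through the block and measure how far it spreads.
impulse = [0.0] * 21
impulse[10] = 1.0
response = dense_dilated_block(impulse, [1.0, 1.0, 1.0], [1, 2, 3])
spread = sum(1 for v in response if v != 0)
```

With a kernel of size 3, three plain layers cover 7 positions, while dilation rates 1, 2, 3 cover 13, which is why dilation is a common way to enlarge the receptive field without adding parameters. The dense connections additionally let the later layers see the less-dilated features directly.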
