Abstract

This paper proposes a convolutional neural network that accurately estimates the density of pedestrians in crowd images, and analyzes the primary factors that affect the performance of networks used for crowd counting. In contrast to the multi-column convolutional neural network (MCNN), which is widely used in previous work, the proposed network is built from an improved Inception-ResNet-A module and trained end to end, which makes it convenient to train and easy to deploy. Experiments were conducted on four datasets: the Mall, UCSD, ShanghaiTech, and UCF_CC_50 datasets. In addition, the performance of different convolutional neural network (CNN) architectures is compared to demonstrate the effectiveness of the proposed network. The experimental results show that the proposed network is more accurate and efficient than existing state-of-the-art methods, overcoming the problems caused by scale variation, pedestrian occlusion, and appearance change. To the best of our knowledge, the proposed network is the first to use the Inception-ResNet-A structure for crowd counting.
