Abstract

Crowd counting plays a substantial role in intelligent surveillance. This work presents a multi-scale multi-task convolutional neural network (MSMT-CNN) that estimates accurate density maps; the crowd count is then obtained by summing all values in the estimated density map. The ground-truth density maps used for training are generated with a novel adaptive human-shaped kernel. The multi-scale strategy addresses the scale problem, and a multi-task learning strategy is added to make the estimated density maps more accurate. A weighted loss function is proposed to enhance the activations in dense regions and suppress background noise. Experimental results on two benchmark datasets demonstrate the strong performance of MSMT-CNN. Compared with existing crowd counting methods, the root mean squared error is decreased by 39.8 on the UCF_CC_50 dataset, and the mean absolute error is decreased by 2.3 on the World Expo’10 dataset. Furthermore, evaluations on real-world bus videos confirm the practicality of MSMT-CNN.
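The relationship between a density map and a crowd count can be illustrated with a minimal sketch. It is not the paper's method: the adaptive human-shaped kernel is not specified in this abstract, so a plain isotropic Gaussian stands in for it, and the function names, `sigma`, and `radius` are hypothetical choices made for illustration only. Each annotated head position contributes a blob of unit mass, so summing the map recovers the number of people.

```python
import math

def gaussian_density_map(points, height, width, sigma=2.0, radius=6):
    """Build a density map with one unit-mass Gaussian blob per head.

    Assumption: a fixed isotropic Gaussian replaces the paper's adaptive
    human-shaped kernel. Every (row, col) in `points` must lie inside the map.
    """
    density = [[0.0] * width for _ in range(height)]
    for row, col in points:
        # Collect kernel weights for the in-bounds cells around this head.
        weights = {}
        for r in range(max(0, row - radius), min(height, row + radius + 1)):
            for c in range(max(0, col - radius), min(width, col + radius + 1)):
                d2 = (r - row) ** 2 + (c - col) ** 2
                weights[(r, c)] = math.exp(-d2 / (2.0 * sigma ** 2))
        total = sum(weights.values())
        # Normalize so each person adds exactly 1 to the sum of the map,
        # even when the kernel is clipped by the image border.
        for (r, c), w in weights.items():
            density[r][c] += w / total
    return density

def count_from_density(density):
    """Estimate the crowd count as the sum of all density values."""
    return sum(sum(row) for row in density)

# Three hypothetical head annotations on a 16 x 32 map.
heads = [(5, 5), (0, 0), (10, 20)]
density_map = gaussian_density_map(heads, height=16, width=32)
estimated_count = count_from_density(density_map)
```

In a trained network the density map would come from the model's output rather than from annotations, but the count is read off the same way.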
