A New Framework For Crowded Scene Counting Based On Weighted Sum Of Regressors and Human Classifier

Phuc Thinh Do, Ngoc Quoc Ly

doi:10.1145/3287921.3287980

Abstract

Crowd density estimation is an important task in surveillance camera systems, with applications in security, traffic, business, and other areas. Monitoring is currently shifting from individuals to crowds, but traditional counting techniques become inefficient in this setting because of issues such as scale, cluttered backgrounds, and occlusion. Most previous methods have focused on modeling to estimate the density map accurately and then infer the count from it. However, on non-human scenes containing clouds, trees, houses, seas, and similar content, these models are often confused and produce inaccurate count estimates. To overcome this problem, we propose the Weighted Sum of Regressors and Human Classifier (WSRHC) method. Our model consists of two main parts: human versus non-human classification and crowd count estimation. First, we build a Human Classifier that filters out negative samples (non-human images) before they reach the regressors. The count is then estimated by the regressors, which differ from one another in filter size. The essence of the method is that the count depends on the weighted average of the density maps obtained from these regressors. This addresses shortcomings of previous models: Switching Convolutional Neural Network (Switch-CNN) selects the output of a single regressor as the count, and Multi-Column Convolutional Neural Network (MCNN) combines the regressors' outputs using fixed weights, whereas our approach adapts the weights to each individual image. Our experiments show that our method outperforms Switch-CNN and MCNN on the ShanghaiTech and UCF_CC_50 datasets.
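The two-stage pipeline described above can be illustrated with a minimal sketch. It assumes density maps are plain 2D lists of floats and that the classifier, the regressors, and the per-image weighting are supplied as callables. The names `is_human`, `regressors`, `weight_fn`, `weighted_count`, and `estimate_count` are hypothetical and are not taken from the paper, and the abstract does not specify how the weights are computed, so the sketch treats them as an input.

```python
from typing import Callable, List, Sequence

DensityMap = List[List[float]]


def weighted_count(density_maps: Sequence[DensityMap],
                   weights: Sequence[float]) -> float:
    """Count obtained from a weighted average of per-regressor density maps."""
    if not density_maps or len(density_maps) != len(weights):
        raise ValueError("need one weight per density map")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    rows, cols = len(density_maps[0]), len(density_maps[0][0])
    # Fuse the maps pixel by pixel using the normalized weights.
    fused = [
        [
            sum(w * m[r][c] for w, m in zip(weights, density_maps)) / total_weight
            for c in range(cols)
        ]
        for r in range(rows)
    ]
    # The count is the sum of the fused density map.
    return sum(sum(row) for row in fused)


def estimate_count(image: object,
                   is_human: Callable[[object], bool],
                   regressors: Sequence[Callable[[object], DensityMap]],
                   weight_fn: Callable[[object], Sequence[float]]) -> float:
    """Hypothetical pipeline: classify first, then fuse the regressor outputs."""
    # Stage 1: non-human images are filtered out (assumed here to count as zero).
    if not is_human(image):
        return 0.0
    # Stage 2: every regressor produces a density map for the same image.
    maps = [regress(image) for regress in regressors]
    # Assumption: the weights are computed per image by some external function.
    return weighted_count(maps, weight_fn(image))
```

In this sketch, a Switch-CNN-style selection corresponds to a one-hot weight vector and a fixed-weight combination corresponds to a `weight_fn` that ignores its input, which is one way to read the contrast the abstract draws.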

Full Text