Mixture of counting CNNs

Shohei Kumagai, Takio Kurita, Kazuhiro Hotta

doi:10.1007/s00138-018-0955-6

Abstract

This paper proposes a crowd counting method. Crowd counting is difficult because the appearance of a target changes significantly with density and scale. Conventional crowd counting methods commonly use a single predictor (e.g., a regressor or a multi-class classifier). However, a single predictor cannot count targets well when their appearance changes significantly. In this paper, we propose to predict the number of targets using multiple convolutional neural networks (CNNs), each specialized to a specific appearance, which are adaptively selected according to the appearance of a test image. By integrating the selected CNNs, the proposed method is robust to large appearance changes. In experiments, we confirm that the proposed method counts crowds with lower counting error than VGGNet, an integration of CNNs with fixed weights, and conventional counting methods. Moreover, we confirm that each CNN is automatically specialized to a specific appearance of the crowd (e.g., dense regions or sparse regions) through training.
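
The integration step described above can be illustrated with a minimal sketch. The abstract does not specify how the CNNs are selected or combined, so this example assumes a standard mixture-of-experts form: a gating component produces one score per expert, the scores are normalized with a softmax, and the final count is the weighted sum of the per-expert counts. The function names `softmax`, `mixture_count`, and `fixed_weight_count` and the numeric values are hypothetical, and plain numbers stand in for the outputs of the actual networks.

```python
import math


def softmax(scores):
    """Turn raw gating scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_count(expert_counts, gating_scores):
    """Combine per-expert count predictions with input-dependent weights.

    expert_counts: one predicted count per expert (stand-ins for CNN outputs).
    gating_scores: one raw score per expert for the same input
                   (assumed to come from a gating network).
    """
    weights = softmax(gating_scores)
    return sum(w * c for w, c in zip(weights, expert_counts))


def fixed_weight_count(expert_counts):
    """Baseline: integrate experts with fixed, equal weights."""
    return sum(expert_counts) / len(expert_counts)


if __name__ == "__main__":
    # Hypothetical outputs: one expert suited to sparse scenes, one to dense.
    counts = [10.0, 50.0]
    # The gating scores strongly favor the second (dense) expert.
    print(mixture_count(counts, [0.0, 5.0]))
    print(fixed_weight_count(counts))
```

With adaptive weights the estimate moves toward the expert that the gating scores favor for the given input, whereas the fixed-weight baseline returns the same average regardless of appearance.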

Full Text