Crowd counting aims to estimate the number, density, and distribution of people in an image. Although CNN-based crowd counting methods have been effective, head-scale variation and complex backgrounds remain two major challenges. We therefore propose a multiscale region calibration network, MRCNet, to address both. For the former, we design a multiscale aware module that uses parallel multi-branch dilated convolutions to obtain multiscale receptive fields and cope with drastic changes in head size. For the latter, we design a regional calibration module that, after the attention map is obtained, calibrates the attention weights of each region to better handle complex backgrounds. Additionally, we improve the loss function by combining L2 loss with binary cross-entropy loss to further improve MRCNet's performance. Extensive experiments on three mainstream datasets demonstrate the robustness and competitiveness of our approach.
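As a rough illustration of two ideas named above, the sketch below shows how dilation widens the span of a convolution kernel and how an L2 term and a binary cross-entropy term might be summed into one loss. It is not the MRCNet implementation: the dilation rates, the mean-squared form of the L2 term, the choice to apply binary cross-entropy to attention values against a 0/1 mask, the `bce_weight` factor, and all function names are assumptions made for this example.

```python
import math


def effective_kernel_size(kernel_size, dilation):
    """Span (in pixels, along one axis) covered by a single dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def l2_loss(pred, target):
    """Mean squared error between predicted and ground-truth density values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def bce_loss(prob, label, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(prob)


def combined_loss(pred_density, gt_density, pred_attention, gt_mask, bce_weight=1.0):
    """Hypothetical combination: L2 on the density map plus weighted BCE on attention."""
    return l2_loss(pred_density, gt_density) + bce_weight * bce_loss(pred_attention, gt_mask)


# Assumed dilation rates for three parallel 3x3 branches.
branch_spans = [effective_kernel_size(3, d) for d in (1, 2, 3)]  # [3, 5, 7]
```

With 3x3 kernels, the three assumed branches cover spans of 3, 5, and 7 pixels at the same parameter count, which is the sense in which parallel dilated branches provide multiscale receptive fields.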