Abstract
Crowd counting, which requires estimating crowd density from an image, remains a challenging task in computer vision. Most current methods focus on the large scale variation of people and ignore the large differences in crowd distribution. To tackle these two problems together, we propose a novel framework named Spatial Normalization Network (SNNet). We normalize multi-scale features from parallel subnetworks to a particular scale and then fuse them to acquire rich spatial information for accurate final density map predictions. Furthermore, we propose a novel normalization layer called Spatial Group Normalization (SGN), which first splits feature maps along the spatial dimension and then performs group-wise normalization. This helps address the statistical shift caused by the large differences in distribution in crowd counting. Moreover, SGN can be naturally plugged into existing solutions and brings significant improvement in crowd counting. Our proposed SNNet achieves state-of-the-art performance on four challenging crowd counting datasets (ShanghaiTech, UCFQNRF, GCC and TRANCOS), which demonstrates the effectiveness and robust feature learning capability of our methods.
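The SGN idea described above (split feature maps along the spatial dimension, then normalize group-wise) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the split is done, so the regular grid of spatial cells, the contiguous channel groups, the absence of learnable scale and shift parameters, and the names `spatial_group_norm`, `grid`, and `num_groups` are all assumptions made for illustration.

```python
import numpy as np


def spatial_group_norm(x, grid=(2, 2), num_groups=2, eps=1e-5):
    """Hypothetical sketch: normalize each (channel group, spatial cell) separately.

    x has shape (N, C, H, W). Assumed, not taken from the paper: a regular
    grid of spatial cells, contiguous channel groups, no affine parameters.
    """
    n, c, h, w = x.shape
    gh, gw = grid
    if h % gh or w % gw or c % num_groups:
        raise ValueError("H, W and C must be divisible by grid and num_groups")
    # Split C into groups, H into gh row blocks, and W into gw column blocks.
    cells = x.reshape(n, num_groups, c // num_groups, gh, h // gh, gw, w // gw)
    # One mean and variance per (sample, channel group, grid row, grid column).
    axes = (2, 4, 6)
    mean = cells.mean(axis=axes, keepdims=True)
    var = cells.var(axis=axes, keepdims=True)
    out = (cells - mean) / np.sqrt(var + eps)
    return out.reshape(n, c, h, w)


# Toy input: the right half of the map has a very different distribution
# from the left half, mimicking dense and sparse crowd regions.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4, 8, 8))
x[..., 4:] = x[..., 4:] * 10 + 50
y = spatial_group_norm(x, grid=(1, 2), num_groups=2)
```

Because statistics are computed per spatial cell in this sketch, the left and right halves of `y` are each brought to roughly zero mean and unit variance, whereas a single set of statistics over the whole map would be dominated by the dense half.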