Abstract

During the COVID-19 pandemic, crowd counting became one of the tools for monitoring whether social-distancing norms were being followed. However, designing a crowd counting algorithm poses several challenges, such as background noise, camera-to-object distance, occlusion, and variations in illumination, scale, and viewpoint. In this research, we propose a novel pipeline for density estimation in crowd counting. The proposed pipeline uses an encoder-decoder architecture in which we explore the EfficientNet family for the encoder. For the decoder, we propose a deeper attention network that helps the model better distinguish between foreground and background pixels. We empirically show that, for a crowd counting dataset, using the average pooling operation in any encoder backbone architecture gives a significant improvement in performance. In terms of Mean Absolute Error, the proposed pipeline outperforms existing state-of-the-art techniques by a large margin on the large-scale and small-scale counting datasets UCF-QNRF and UCF_CC_50, respectively. We also achieve state-of-the-art results on the ShanghaiTech and Mall datasets. We additionally propose a crowd counting dataset captured using drones, and we perform benchmark experiments on this dataset with existing methods and the proposed method. The proposed dataset can be found at http://www.iab-rubric.org/resources/CrowdUAV.html.
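
To make the evaluation metric concrete, the following is a minimal sketch, not the authors' code, of how a count is typically read off a predicted density map and how Mean Absolute Error is computed over a set of images. The function names `count_from_density` and `mean_absolute_error` and the toy values are assumptions introduced purely for illustration.

```python
from typing import Sequence

# A density map is represented here as a 2D grid of non-negative floats
# (hypothetical representation chosen for this sketch).
DensityMap = Sequence[Sequence[float]]


def count_from_density(density_map: DensityMap) -> float:
    """Return the estimated crowd count as the sum of all density values."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Return the average absolute difference between predicted and true counts."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    total = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total / len(predicted)


if __name__ == "__main__":
    # Toy density map whose values sum to an estimated count of 2 people.
    toy_map = [[0.5, 0.5], [1.0, 0.0]]
    print(count_from_density(toy_map))

    # Toy per-image counts: predictions versus ground truth.
    print(mean_absolute_error([10.0, 20.0], [12.0, 17.0]))
```

A lower value means the predicted counts are closer to the ground-truth counts on average, which is the sense in which methods are compared on the datasets named above.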
