Abstract
Nowadays, due to various challenges such as large-scale variation of population, mutual occlusion, perspective distortion and so on, crowd counting has gradually become a hot issue in computer vision. To address the large- scale variation exists in the images, in this paper, we propose a novel multi-scale network called MSNet which aims to maintain continuous variations and count the number of pedestrians accurately. While most state-of-the- arts multi-scale and multi-column networks aim to integrate the scale information of heads with different size, lots of researches still need to do to achieve continuous variations. In MSNet, specifically, the first ten layers of the visual geometry group network(VGG) are used as the backbone to extract the rough features of images and a multi-scale block is employed to maintain the scale information which contains several receptive kernels to obtain a better performance towards the difficulty of scale-variation. Inspired by the knowledge that using multiple small receptive field kernels to replace a single large receptive field will get a better performance, we utilize two dilated convolutions with the receptive field of 5 to replace the large kernel. Our MSNet has moderate increase in computation, and we evaluate our method on three benchmark datasets including ShanghaiTech (Part A: MAE=59.6, RMSE=96.1; Part B: MAE=7.5, RMSE=12.1), UCF-CC-50(MAE=207.9, RMSE=273.8) and UCF-QNRF(MAE=93, RMSE=158) to show the outperformance of our method.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.