Abstract
Crowd counting has become a prominent vision task driven by numerous practical applications, but it remains challenging. State-of-the-art methods generally estimate the density map of a crowd image from the high-level semantic features of deep convolutional networks. However, the absence of low-level spatial information can cause counting errors in the local details of the density map. To this end, we propose a novel framework named Multi-level Feature Fusion Network (MFFN) for single-image crowd counting. MFFN is constructed in an encoder–decoder fashion and incorporates both semantic and spatial information to generate high-resolution density maps of input crowd images. Skip connections between the encoder and the decoder combine low-level spatial information with high-level semantic features by element-wise addition. In addition, a dense dilated convolution block placed behind the encoder extracts multi-scale context features, which guide feature fusion through a channel attention mechanism. The model is trained with multi-task learning, in which semantic segmentation supervision is introduced to enhance feature representation. Extensive experiments on three crowd counting datasets (ShanghaiTech, UCF_CC_50, UCF-QNRF) show that MFFN outperforms state-of-the-art methods, and comprehensive ablation studies verify the effectiveness of each component of the proposed method.
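To make the fusion scheme described above concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the class names, layer widths, dilation rates, and the squeeze-and-excitation style channel attention are illustrative assumptions; the actual MFFN uses a deeper backbone and an additional segmentation head for multi-task supervision.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (an assumption;
    the paper's exact attention mechanism may differ)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Global average pool -> per-channel weights in [0, 1].
        w = self.fc(x.mean(dim=(2, 3)))
        return x * w.view(x.size(0), -1, 1, 1)


class DenseDilatedBlock(nn.Module):
    """Dilated convolutions with dense connections, enlarging the
    receptive field to capture multi-scale context features."""
    def __init__(self, channels, dilations=(1, 2, 3)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(channels * (i + 1), channels, 3, padding=d, dilation=d)
            for i, d in enumerate(dilations)
        )

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            # Each layer sees the concatenation of all previous outputs.
            feats.append(F.relu(conv(torch.cat(feats, dim=1))))
        return feats[-1]


class MFFNSketch(nn.Module):
    """Two-level encoder-decoder with an additive skip connection;
    a deliberately shallow stand-in for the full MFFN architecture."""
    def __init__(self, base=16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, base, 3, padding=1), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.context = DenseDilatedBlock(base * 2)
        self.attn = ChannelAttention(base * 2)
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(base, base, 3, padding=1), nn.ReLU(inplace=True))
        self.head = nn.Conv2d(base, 1, 1)  # 1-channel density map

    def forward(self, x):
        s1 = self.enc1(x)                  # low-level spatial features
        s2 = self.enc2(s1)                 # high-level semantic features
        ctx = self.attn(self.context(s2))  # multi-scale context, channel-reweighted
        d = self.dec(self.up(ctx) + s1)    # skip connection by element-wise addition
        return self.head(d)                # high-resolution density map


if __name__ == "__main__":
    model = MFFNSketch()
    density = model(torch.randn(1, 3, 128, 128))
    # The predicted crowd count is the spatial sum of the density map.
    print(density.shape, density.sum().item())
```

The sketch keeps only the ingredients named in the abstract: additive encoder–decoder skip connections recover spatial detail, the dense dilated block supplies multi-scale context, and the channel attention reweights that context before it is fused into the decoder.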