Abstract

This paper proposes a dense crowd counting method based on a self-attention mechanism with a dual-branch fusion network, aiming to address the large variations in head scale and the complex backgrounds found in dense crowd images. The method combines CNN and Transformer frameworks and consists of a shallow feature extraction network, a dual-branch fusion network, and a deep feature extraction network. The shallow feature extraction network uses VGG16 to extract low-level features. The dual-branch fusion network comprises a multi-scale CNN branch and a Transformer branch built on an improved self-attention module, which capture local and global information about crowd regions, respectively. The deep feature extraction network employs a Transformer based on a mixed attention module to further separate complex backgrounds and focus on crowd regions. Experiments are conducted under both counting-level weak supervision and location-level full supervision. On four widely used datasets, the results show that the proposed method outperforms recent state-of-the-art approaches. Compared with existing weakly supervised methods, it achieves higher counting accuracy with fewer parameters, reaching 89.1% of the counting accuracy obtained under full supervision. The experimental results demonstrate that the method delivers excellent crowd counting performance and counts accurately in high-density, heavily occluded scenes.
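To make the dual-branch idea concrete, the sketch below pairs a multi-scale convolutional branch (local detail) with a self-attention branch (global context) and fuses their outputs. It is a minimal illustration in PyTorch: the channel sizes, kernel set, and concatenation-based fusion are our assumptions for exposition, not the paper's exact design.

```python
# Illustrative sketch of a dual-branch fusion block.
# Assumptions (not from the paper): channel widths, kernel sizes (1/3/5/7),
# a standard multi-head self-attention layer, and concat + 1x1-conv fusion.
import torch
import torch.nn as nn


class MultiScaleCNNBranch(nn.Module):
    """Local branch: parallel convolutions with different receptive fields."""
    def __init__(self, channels: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels // 4, kernel_size=k, padding=k // 2)
            for k in (1, 3, 5, 7)
        ])

    def forward(self, x):
        # Each parallel conv keeps the spatial size; concatenate along channels.
        return torch.cat([b(x) for b in self.branches], dim=1)


class TransformerBranch(nn.Module):
    """Global branch: self-attention over flattened spatial positions."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)        # (B, H*W, C)
        attended, _ = self.attn(tokens, tokens, tokens)
        tokens = self.norm(tokens + attended)        # residual + layer norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class DualBranchFusion(nn.Module):
    """Fuse local (CNN) and global (attention) features: concat + 1x1 conv."""
    def __init__(self, channels: int):
        super().__init__()
        self.local_branch = MultiScaleCNNBranch(channels)
        self.global_branch = TransformerBranch(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        local_feat = self.local_branch(x)
        global_feat = self.global_branch(x)
        return self.fuse(torch.cat([local_feat, global_feat], dim=1))


if __name__ == "__main__":
    feats = torch.randn(1, 256, 48, 64)          # e.g. VGG16 conv features
    out = DualBranchFusion(256)(feats)
    print(out.shape)                             # torch.Size([1, 256, 48, 64])
```

The fused features would then feed the deeper Transformer stage; how the paper actually combines the two branches (addition, attention-weighted fusion, or concatenation) is not specified in the abstract, so the 1x1-conv fusion here is only one plausible choice.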
