Abstract

Video-based crowd counting and density estimation (CCDE) is vital for crowd monitoring. Existing solutions fall short in addressing issues such as cluttered backgrounds and scale variation in crowd videos. To this end, a multiscale head attention-guided multiscale density map fusion for video-based CCDE via a multi-attention spatial-temporal CNN (MHAMD-MST-CNN) is proposed. The MHAMD-MST-CNN has three modules: a multi-attention spatial stream (MASS), a multi-attention temporal stream (MATS), and a final density map generation (FDMG) module. The spatial head attention modules (SHAMs) and temporal head attention modules (THAMs) are designed to eliminate background influence from the MASS and the MATS, respectively, by mapping the multiscale spatial or temporal features to head maps. The multiscale de-backgrounded features are utilised by the density map generation (DMG) modules to generate multiscale density maps, which handle the scale variation caused by perspective distortion. The multiscale density maps are fused and fed into the FDMG module to obtain the final crowd density map. The MHAMD-MST-CNN has been trained and validated on three publicly available benchmark datasets: Venice, Mall, and UCSD. It provides competitive results compared with state-of-the-art methods in terms of mean absolute error (MAE) and root mean squared error (RMSE).
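The abstract describes the architecture only at a high level. The following minimal PyTorch sketch illustrates the data flow it outlines: two multiscale streams whose features are gated by head-attention maps, per-scale density map generation, and fusion into a final density map. All layer widths, kernel sizes, the number of scales, the two-channel motion input (e.g. optical flow), and the 1x1-convolution fusion are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch of the MHAMD-MST-CNN pipeline; module structure,
# channel widths, and inputs are assumed for illustration only.
import torch
import torch.nn as nn


class HeadAttentionModule(nn.Module):
    """Maps features to a head map that suppresses background (SHAM/THAM role)."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),  # per-pixel head probability
        )

    def forward(self, feats):
        # De-backgrounded features: the head map gates the input features.
        return feats * self.attn(feats)


class DMG(nn.Module):
    """Density map generation from de-backgrounded features at one scale."""
    def __init__(self, channels):
        super().__init__()
        self.head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):
        return self.head(feats)


class Stream(nn.Module):
    """One multiscale stream (spatial or temporal) with attention + DMG per scale."""
    def __init__(self, in_ch, widths=(32, 64, 128)):  # widths are assumed
        super().__init__()
        self.blocks = nn.ModuleList()
        c = in_ch
        for w in widths:
            self.blocks.append(nn.Sequential(
                nn.Conv2d(c, w, 3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            ))
            c = w
        self.attn = nn.ModuleList(HeadAttentionModule(w) for w in widths)
        self.dmg = nn.ModuleList(DMG(w) for w in widths)

    def forward(self, x):
        maps = []
        for block, attn, dmg in zip(self.blocks, self.attn, self.dmg):
            x = block(x)
            maps.append(dmg(attn(x)))
        # Resize all per-scale density maps to a common size before fusion.
        size = maps[0].shape[-2:]
        maps = [nn.functional.interpolate(m, size=size, mode='bilinear',
                                          align_corners=False) for m in maps]
        return torch.cat(maps, dim=1)


class MHAMD_MST_CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.mass = Stream(in_ch=3)   # spatial stream: current RGB frame
        self.mats = Stream(in_ch=2)   # temporal stream: assumed 2-channel motion input
        self.fdmg = nn.Conv2d(6, 1, kernel_size=1)  # fuse 2 streams x 3 scales

    def forward(self, frame, motion):
        fused = torch.cat([self.mass(frame), self.mats(motion)], dim=1)
        return self.fdmg(fused)  # final crowd density map


model = MHAMD_MST_CNN()
frame = torch.randn(1, 3, 128, 128)
motion = torch.randn(1, 2, 128, 128)
density = model(frame, motion)
print(density.shape, density.sum().item())  # crowd count is the density sum
```

Summing the final density map yields the estimated crowd count, which is how MAE and RMSE are typically computed against ground-truth counts.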
