Abstract

We propose a strategy for estimating the number of people in a crowd, one of the aims of crowd analysis, from static images or video frames. While the first studies on crowd analysis relied on manual feature extraction with pixel- and regression-based methods, recent studies use Convolutional Neural Network (CNN) based models. However, it is still difficult to extract spatial information such as position, orientation, posture, and angular value for crowd estimation from a density map. This study uses capsule networks and the routing-by-agreement algorithm as an attention module. Our proposed approach combines CNN and capsule network-based attention modules in a two-column deep neural network architecture. We evaluate the approach against other state-of-the-art methods on five well-known datasets: UCF-QNRF, UCF_CC_50, UCSD, ShanghaiTech Part A, and WorldExpo’10.

Highlights

  • Population growth and rapid urbanization gather people together and require planning in order to prevent crowd congestion

  • In the light of these observations, our study proposes an attention-based model that combines the ability of two-column Convolutional Neural Networks (CNN) to learn useful features with the ability of Capsule Networks (CapsNet) to capture spatial information

  • Although object recognition-based CNN approaches to crowd counting have far exceeded the success of earlier pixel and color-based approaches, they still have unresolved problems in capturing spatial information that varies with scale


Summary

INTRODUCTION

Population growth and rapid urbanization gather people together and require planning in order to prevent crowd congestion. In the light of these observations, our study proposes an attention-based model that combines the ability of two-column CNNs to learn useful features with the ability of Capsule Networks (CapsNet) to capture spatial information. The proposed module, presented in this paper (Bolat: Crowd Density Estimation by Using Attention Based Capsule Network and Multi-Column CNN), is examined on various datasets, and the results are comparable with current studies in the literature. Although object recognition-based CNN approaches to crowd counting have far exceeded the success of earlier pixel and color-based approaches, they still have unresolved problems in capturing spatial information that varies with scale. The unique orientations, postures, and angular values of individuals and groups in crowded images are important in estimating crowd density. Crowd behavior analysis using these spatial features may be the subject of future studies.
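
To make the routing-by-agreement idea more concrete, the following is a minimal NumPy sketch of the generic dynamic routing procedure used in capsule networks. It is not the attention module described in this study: the function names (`squash`, `route_by_agreement`), the tensor shapes, and the choice of three routing iterations are all assumptions made for illustration. The coupling coefficients `c` could be read as attention-like weights, which is the loose sense in which routing may serve as an attention mechanism.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-8):
    """Scale each vector so its length lies in [0, 1), keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def route_by_agreement(u_hat, num_iters=3):
    """Generic dynamic routing (illustrative, hypothetical helper).

    u_hat: array of shape (num_in, num_out, dim) holding the prediction
           vectors that each lower-level capsule makes for each output capsule.
    Returns the output capsules v with shape (num_out, dim) and the coupling
    coefficients c with shape (num_in, num_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        c = softmax(b, axis=1)                     # each input splits its vote
        s = np.einsum("ij,ijd->jd", c, u_hat)      # weighted sum per output
        v = squash(s)                              # output capsule vectors
        b = b + np.einsum("ijd,jd->ij", u_hat, v)  # reward agreement
    return v, c


# Toy input: 6 lower-level capsules, 3 output capsules, 4-dimensional poses.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))
v, c = route_by_agreement(u_hat)
```

In this sketch each row of `c` sums to one, so every lower-level capsule distributes a fixed amount of "vote" across the output capsules, and predictions that agree with an output capsule receive a larger share on the next iteration.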

RELATED WORK
CAPSULE NETWORK
EXPERIMENTAL RESULTS
Findings
CONCLUSION AND DISCUSSION