Abstract

This paper identifies a problem in crowd estimation for real-life images and suggests a solution. In general, cameras in public places are mounted at fixed positions and capture top-view images, and most recent neural networks are developed with such images. Recently, the CSRNet model was developed for the ShanghaiTech dataset and achieved better accuracy than state-of-the-art methods. However, it is difficult to capture top-view images where a crowd gathers at random, such as during a strike or a riot. Therefore, we capture both top-view and front-view images to deal with such circumstances. In this work, the CSRNet model is evaluated on two test cases, one consisting only of top-view images and the other only of front-view images. The mean absolute error (MAE) and mean squared error (MSE) for the front-view images are higher than for the top-view images: relative to the top-view images, the MAE and MSE of the CSRNet model on the front-view images are higher by 28.64% and 47.86%, respectively. Higher MAE and MSE mean lower performance. This issue can be resolved using the suggested GANN network, which can project front-view images into top-view images. The projected images can then be evaluated using the CSRNet model.

Keywords: Crowd estimation; CSRNet; Dilated CNN; Gradient adversarial neural network (GANN); Top view; Front view
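As a rough illustration of how the reported comparison could be computed, the sketch below evaluates MAE and MSE over per-image crowd counts and then the relative increase from a top-view test set to a front-view test set. The counts are made-up placeholders, not results from the paper, and the function names are hypothetical. It also assumes MSE is the plain mean of squared errors; some crowd-counting papers report its square root under the same name, and the abstract does not say which is used.

```python
def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mse(predicted, actual):
    """Mean squared error (assumption: no square root is taken)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def relative_increase_pct(front_value, top_value):
    """Percentage by which the front-view metric exceeds the top-view metric."""
    return 100.0 * (front_value - top_value) / top_value


# Hypothetical per-image counts, for illustration only.
top_actual = [100, 200, 300]
top_predicted = [110, 190, 310]
front_actual = [100, 200, 300]
front_predicted = [120, 190, 315]

top_mae = mae(top_predicted, top_actual)
front_mae = mae(front_predicted, front_actual)
top_mse = mse(top_predicted, top_actual)
front_mse = mse(front_predicted, front_actual)

print("MAE increase (%):", relative_increase_pct(front_mae, top_mae))
print("MSE increase (%):", relative_increase_pct(front_mse, top_mse))
```

A positive relative increase means the model counts less accurately on the front-view set than on the top-view set.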
