Abstract

Crowd counting and density estimation is an important and challenging problem in the visual analysis of the crowd. Most of the existing approaches use regression on density maps for the crowd count from a single image. However, these methods cannot localize individual pedestrian and therefore cannot estimate the actual distribution of pedestrians in the environment. On the other hand, detection-based methods detect and localize pedestrians in the scene, but the performance of these methods degrades when applied in high-density situations. To overcome the limitations of pedestrian detectors, we proposed a motion-guided filter (MGF) that exploits spatial and temporal information between consecutive frames of the video to recover missed detections. Our framework is based on the deep convolution neural network (DCNN) for crowd counting in the low-to-medium density videos. We employ various state-of-the-art network architectures, namely, Visual Geometry Group (VGG16), Zeiler and Fergus (ZF), and VGGM in the framework of a region-based DCNN for detecting pedestrians. After pedestrian detection, the proposed motion guided filter is employed. We evaluate the performance of our approach on three publicly available datasets. The experimental results demonstrate the effectiveness of our approach, which significantly improves the performance of the state-of-the-art detectors.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.