Abstract

Pedestrian density estimation is a key problem in intelligent transportation systems and has been applied in a number of other engineering fields. Compared with counting-by-detection and counting-by-clustering methods, counting-by-regression methods are better suited to this problem because they are robust to inter-person occlusion and do not impose the impractical requirement of a high video frame rate. However, existing counting-by-regression approaches extract image features from the whole region, or from spatially localized cells or pixels, of each single video frame, which ignores the motion patterns of the same pedestrians across neighboring frames. In light of this, this paper exploits a novel tensor-formed spatiotemporal feature representation and applies it in a multilinear regression learning framework, which captures spatially distributed dynamic crowd patterns by discovering the latent multidimensional structural correlations of tensor features along both the spatial (i.e., horizontal and vertical) and temporal dimensions. Extensive evaluation on the public UCSD and Shopping Mall benchmarks demonstrates that our approach outperforms state-of-the-art counting methods, even when the surveillance data has a low frame rate.
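To make the idea of multilinear regression over a spatiotemporal tensor concrete, the following is a minimal sketch, not the paper's implementation. It assumes each sample is a feature tensor of shape (H, W, T) covering the vertical, horizontal, and temporal modes, and that the count is modeled with a rank-1 weight tensor, i.e. one weight vector per mode, fitted by alternating ridge least squares. The function names, the rank-1 assumption, and the ridge term are illustrative choices; the abstract does not specify them.

```python
import numpy as np

def _ridge_solve(Z, y, ridge):
    """Solve a ridge least-squares problem with an appended bias column."""
    A = np.hstack([Z, np.ones((Z.shape[0], 1))])
    G = A.T @ A + ridge * np.eye(A.shape[1])  # small ridge (also on bias) for stability
    coef = np.linalg.solve(G, A.T @ y)
    return coef[:-1], float(coef[-1])

def fit_rank1_tensor_regression(X, y, n_iters=50, ridge=1e-6, seed=0):
    """Fit y ~ <X, u o v o w> + b by alternating least squares.

    X: array of shape (N, H, W, T); y: array of shape (N,).
    Returns the mode weights (u, v, w) and the bias b.
    """
    rng = np.random.default_rng(seed)
    _, H, W, T = X.shape
    u = rng.standard_normal(H)
    v = rng.standard_normal(W)
    w = rng.standard_normal(T)
    b = float(np.mean(y))
    for _ in range(n_iters):
        # Contract the tensor with two of the weight vectors, then solve
        # an ordinary ridge regression for the remaining one.
        u, b = _ridge_solve(np.einsum('nhwt,w,t->nh', X, v, w), y, ridge)
        v, b = _ridge_solve(np.einsum('nhwt,h,t->nw', X, u, w), y, ridge)
        w, b = _ridge_solve(np.einsum('nhwt,h,w->nt', X, u, v), y, ridge)
    return u, v, w, b

def predict(X, u, v, w, b):
    """Predict counts by contracting each sample with the rank-1 weights."""
    return np.einsum('nhwt,h,w,t->n', X, u, v, w) + b
```

Under these assumptions the model needs only H + W + T + 1 parameters, rather than the H·W·T + 1 required when the tensor is flattened into a vector. This is one way in which a multilinear model can exploit structure along the spatial and temporal modes.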
