Abstract
Gait recognition and understanding systems have broad application prospects. However, their reliance on unstructured image and video data limits their performance: they are easily affected by view changes, occlusion, clothing, and object-carrying conditions. This paper addresses these problems using realistic 3-dimensional (3D) human structural data and a sequential pattern learning framework with a top-down attention modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate 2-dimensional (2D) to 3D estimation method for human body pose and shape semantic parameters is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded as a sparse 2D matrix to construct the structural gait semantic image. To achieve time-based gait recognition, an HTM network is constructed to obtain sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to handle various conditions, including multiple views, by refining the SL-GSDRs according to prior knowledge. The proposed gait learning model not only helps gait recognition overcome the difficulties of real application scenarios but also provides structured gait semantic images for visual cognition. Experimental analyses on the CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness.
Highlights
The development and application of intelligent surveillance technology have led to high demands for social security
A novel gait recognition method based on gait semantic folding and the Hierarchical Temporal Memory (HTM) Network
Summary
The development and application of intelligent surveillance technology have led to high demands for social security. Traditional data dimensionality reduction methods can address the above problems to a certain extent, but their effectiveness often depends on the number of samples and the application scenario. Their generalization is weak, and the data after dimensionality reduction is difficult to understand, i.e., such methods are usually considered a ‘black box’ without semantic information. Another problem is that gait, as a sequence of actions, and its behaviour are better analyzed using spatial-temporal models rather than static images [7]. To take advantage of HTM, our method abstracts the raw gait images into a high-level semantic description.
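To make the idea of encoding body parameters as a sparse 2D matrix more concrete, the following is a minimal sketch and not the paper's actual gait semantic folding procedure. It assumes a simple HTM-style scalar encoder in which each parameter occupies one row and activates a short contiguous run of bits. The function names, the parameter ranges, and the row width are all hypothetical choices made for illustration.

```python
def encode_scalar(value, lo, hi, width=32, active=4):
    """Encode one scalar as a binary row with a contiguous run of `active` ones."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(value, lo), hi)
    # Map the clipped value to the start index of the active run.
    start = round((clipped - lo) / (hi - lo) * (width - active))
    return [1 if start <= i < start + active else 0 for i in range(width)]


def fold_parameters(params, ranges, width=32, active=4):
    """Stack one encoded row per parameter into a sparse binary 2D matrix."""
    return [
        encode_scalar(v, lo, hi, width, active)
        for v, (lo, hi) in zip(params, ranges)
    ]


def overlap(a, b):
    """Count the positions that are active in both matrices."""
    return sum(x & y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# Hypothetical parameters: two joint angles in degrees and a body scale factor.
ranges = [(0.0, 180.0), (0.0, 180.0), (0.5, 2.0)]
sdr_a = fold_parameters([30.0, 90.0, 1.0], ranges)
sdr_b = fold_parameters([32.0, 95.0, 1.0], ranges)   # close to frame a
sdr_c = fold_parameters([150.0, 10.0, 1.8], ranges)  # far from frame a

print(overlap(sdr_a, sdr_b), overlap(sdr_a, sdr_c))
```

In this sketch, frames with similar parameters share more active bits than dissimilar ones, which is the property that lets an overlap count act as a rough semantic similarity measure between sparse representations.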