Abstract

In this paper, we propose a novel method for human action recognition based on sparse coding with spatial pyramid matching. Spatio-temporal interest points (STIPs) are first detected by a newly developed detector, the spatio-temporal steerable detector (STSD). To effectively capture the distribution of STIPs in the video sequence, we project the STIPs onto three orthogonal planes (TOP) and employ a sparse coding algorithm combined with spatial pyramid matching to encode the layout of the STIPs. The structure of an action is thereby encoded, yielding an informative holistic descriptor for action representation. Extensive experiments have been conducted on the KTH and HMDB51 datasets. Our method achieves state-of-the-art performance for action recognition, showing the effectiveness of the proposed method for human action representation.
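
To make the projection and pyramid steps concrete, the following is a minimal sketch and not the authors' implementation. It assumes each STIP is given as an (x, y, t) triple already normalised to [0, 1], takes the three planes to be XY, XT and YT, and replaces the sparse coding step with plain per-cell point counts so that the example stays self-contained. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]  # assumed layout: (x, y, t), each in [0, 1]
Point2D = Tuple[float, float]


def project_to_top(points: List[Point3D]) -> Dict[str, List[Point2D]]:
    """Project (x, y, t) points onto the XY, XT and YT planes."""
    return {
        "xy": [(x, y) for x, y, t in points],
        "xt": [(x, t) for x, y, t in points],
        "yt": [(y, t) for x, y, t in points],
    }


def pyramid_histogram(points: List[Point2D], levels: int = 3) -> List[int]:
    """Concatenate per-cell point counts over 1x1, 2x2, 4x4, ... grids.

    Assumption: plain counts stand in for the pooled sparse codes that
    the abstract describes.
    """
    feature: List[int] = []
    for level in range(levels):
        cells = 2 ** level
        counts = [0] * (cells * cells)
        for u, v in points:
            # Clamp so that a coordinate of exactly 1.0 lands in the last cell.
            i = min(int(u * cells), cells - 1)
            j = min(int(v * cells), cells - 1)
            counts[i * cells + j] += 1
        feature.extend(counts)
    return feature


def top_descriptor(points: List[Point3D], levels: int = 3) -> List[int]:
    """Concatenate the pyramid histograms of the three projections."""
    planes = project_to_top(points)
    descriptor: List[int] = []
    for name in ("xy", "xt", "yt"):
        descriptor.extend(pyramid_histogram(planes[name], levels))
    return descriptor
```

With `levels` pyramid levels, each plane contributes 1 + 4 + ... + 4^(levels-1) cells, so the default of three levels gives a 63-dimensional descriptor per video. A pipeline closer to the one in the abstract would pool sparse codes of local descriptors within each cell instead of counting points.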
