Recognizing actions via sparse coding on structure projection

Lei Zhang, Tao Wang, Xiantong Zhen

doi:10.1109/icip.2013.6738497

Abstract

In this paper, we propose a novel method for human action recognition based on sparse coding with spatial pyramid matching. Spatio-temporal interest points (STIPs) are first detected by a newly developed detector, the spatio-temporal steerable detector (STSD). To effectively capture the distribution of STIPs in a video sequence, we project the STIPs onto three orthogonal planes (TOP) and employ a sparse coding algorithm combined with spatial pyramid matching to encode their layout. The structure of an action is thereby sufficiently encoded, yielding an informative holistic descriptor for action representation. Extensive experiments have been conducted on the KTH and HMDB51 datasets. Our method achieves state-of-the-art performance for action recognition, demonstrating the effectiveness of the proposed method for human action representation.
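As a rough illustration of the projection idea described in the abstract, the following Python sketch projects a set of (x, y, t) interest points onto the XY, XT and YT planes and summarizes each projection with a spatial pyramid of point counts. It is a minimal sketch under stated assumptions, not the authors' implementation: the STSD detector is not reproduced, plain count histograms stand in for the sparse codes, and the function names (`project_to_top`, `pyramid_histogram`, `top_descriptor`), the number of pyramid levels, and the normalization are all hypothetical choices made for this example.

```python
import numpy as np


def project_to_top(points):
    """Project (x, y, t) points onto the XY, XT and YT planes."""
    points = np.asarray(points, dtype=float)
    return {
        "xy": points[:, [0, 1]],
        "xt": points[:, [0, 2]],
        "yt": points[:, [1, 2]],
    }


def pyramid_histogram(points_2d, extent, levels=3):
    """Concatenate point counts over 1x1, 2x2, 4x4, ... grids.

    Assumption: raw counts are used here in place of pooled sparse codes.
    """
    first_max, second_max = extent
    features = []
    for level in range(levels):
        cells = 2 ** level  # grid resolution at this pyramid level
        hist, _, _ = np.histogram2d(
            points_2d[:, 0],
            points_2d[:, 1],
            bins=cells,
            range=[[0, first_max], [0, second_max]],
        )
        features.append(hist.ravel())
    return np.concatenate(features)


def top_descriptor(points, size, levels=3):
    """Build one holistic vector from the three plane-wise pyramids.

    `size` is (width, height, frames) of the video volume (assumed known).
    """
    width, height, frames = size
    planes = project_to_top(points)
    extents = {
        "xy": (width, height),
        "xt": (width, frames),
        "yt": (height, frames),
    }
    parts = [
        pyramid_histogram(planes[key], extents[key], levels)
        for key in ("xy", "xt", "yt")
    ]
    descriptor = np.concatenate(parts)
    total = descriptor.sum()
    # L1-normalize so videos with different numbers of points are comparable.
    return descriptor / total if total > 0 else descriptor


if __name__ == "__main__":
    # Two made-up interest points in a 160x120 video with 40 frames.
    example_points = [[10, 20, 5], [100, 50, 30]]
    vector = top_descriptor(example_points, (160, 120, 40), levels=2)
    print(vector.shape)
```

With two pyramid levels, each plane contributes 1 + 4 = 5 bins, so the concatenated descriptor has 15 dimensions. In a fuller pipeline, the per-cell counts would presumably be replaced by pooled sparse codes learned over a dictionary, which this sketch does not attempt.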

Full Text