Abstract

This paper addresses the problem of recognizing human actions from depth videos. We propose a representation for 3D action recognition that combines a depth-based local descriptor with locality-constrained affine subspace coding (LASC). First, each depth video sequence is divided into a set of subsequences (i.e., multi-scale sub-actions) based on the normalized motion energy vector. Next, depth motion map-based gradient local auto-correlation features are employed to capture the shape information and motion cues of each sub-action. To obtain a discriminative and compact representation, we extract the local high-order information of the depth video using LASC. Experiments show that LASC performs better than existing coding methods such as locality-constrained linear coding. We compare LASC on four datasets with state-of-the-art methods based on similar principles that use features extracted from a single modality, and with methods that use multiple features or nonlinear recognition machines. The results on the four datasets show the effectiveness of the proposed method.
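
The segmentation step can be illustrated with a minimal sketch. The abstract does not define how the normalized motion energy vector is computed, so everything below is an assumption made for illustration: motion energy is taken as the number of pixels whose depth changes by more than a threshold between consecutive frames, accumulated over time and scaled to end at 1, and a sequence is cut where that curve crosses equal fractions. The function names, the `threshold` parameter, and the `scales` values are hypothetical and are not taken from the paper.

```python
import numpy as np


def normalized_motion_energy(frames, threshold=10.0):
    """Cumulative motion energy per frame, scaled so the last value is 1.0.

    Assumption: energy between two frames is the count of pixels whose
    depth value changes by more than `threshold`.
    """
    frames = np.asarray(frames, dtype=np.float64)
    moving = np.abs(np.diff(frames, axis=0)) > threshold
    per_step = moving.reshape(moving.shape[0], -1).sum(axis=1).astype(np.float64)
    cumulative = np.concatenate([[0.0], np.cumsum(per_step)])
    total = cumulative[-1]
    if total == 0:
        # No detectable motion: fall back to uniform spacing in time.
        return np.linspace(0.0, 1.0, len(frames))
    return cumulative / total


def split_by_energy(energy, n_parts):
    """Return (start, end) frame ranges, end exclusive, cut at equal energy fractions.

    A range can be empty if the energy curve jumps past a cut point.
    """
    cuts = [int(np.searchsorted(energy, k / n_parts, side="left"))
            for k in range(1, n_parts)]
    bounds = [0] + cuts + [len(energy)]
    return [(bounds[i], bounds[i + 1]) for i in range(n_parts)]


def multi_scale_sub_actions(frames, scales=(1, 2, 4), threshold=10.0):
    """Map each scale (number of parts) to its list of frame ranges."""
    energy = normalized_motion_energy(frames, threshold)
    return {n: split_by_energy(energy, n) for n in scales}
```

Cutting on accumulated energy rather than on frame count gives segments that carry comparable amounts of motion, so a long pause at the start of a recording does not consume a whole sub-action. Whether this matches the paper's exact definition cannot be confirmed from the abstract alone.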
