Abstract

Interest in skeleton-based human behavior recognition has grown in recent years. Skeleton sequences can be expressed naturally as high-order tensor time series, and in this paper we report on the modeling and analysis of such time series using a linear dynamical system (LDS). Owing to their relative simplicity and efficiency, LDSs are the most common tool used across disciplines for encoding spatiotemporal time series data. However, conventional LDSs treat the latent and observed states at each frame of a video as column vectors, a representation that discards valuable structural information associated with human action. To address this, we propose a tensor-based linear dynamical system (tLDS) for modeling tensor observations in time series and employ Tucker decomposition to estimate the parameters of the LDS model, which serve as action descriptors. In this manner, an action can be expressed as a subspace corresponding to a point on a Grassmann manifold, on which classification can be performed using dictionary learning and sparse coding. Experiments on the MSR Action3D, UCF Kinect, and Northwestern-UCLA Multiview Action3D datasets demonstrate the excellent performance of the proposed method.
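
As a rough illustration of the Tucker decomposition mentioned above, the following sketch computes a truncated higher-order SVD (HOSVD), one common way to obtain a Tucker decomposition, of a synthetic skeleton tensor using NumPy. It is not the tLDS parameter estimation procedure itself. The tensor layout (joints x coordinates x frames), the chosen ranks, and the helper names `unfold`, `mode_dot`, `hosvd`, and `reconstruct` are all assumptions made for this example.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_dot(tensor, matrix, mode):
    """Mode-n product: multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor, ranks):
    """Truncated HOSVD: returns a core tensor and one orthonormal factor per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding span that mode's subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Map the core tensor back to the original space through the factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_dot(out, u, mode)
    return out


# Hypothetical data: 20 joints, 3 coordinates, 50 frames (random values, not a real dataset).
rng = np.random.default_rng(0)
sequence = rng.standard_normal((20, 3, 50))

core, factors = hosvd(sequence, ranks=(5, 3, 10))
approx = reconstruct(core, factors)
rel_error = np.linalg.norm(sequence - approx) / np.linalg.norm(sequence)
print(core.shape, [f.shape for f in factors], rel_error)
```

Each factor matrix has orthonormal columns, so its column space can be viewed as a point on a Grassmann manifold, which is the kind of subspace representation the abstract refers to. How the paper derives its descriptors from the tLDS parameters is not specified here.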
