Action Recognition Using Multilevel Features and Latent Structural SVM

Xinxiao Wu, Dong Xu, Jiebo Luo, Lixin Duan, Yunde Jia

doi:10.1109/tcsvt.2013.2244794

Abstract

We first propose a new low-level visual feature, called spatio-temporal context distribution feature of interest points, to describe human actions. Each action video is expressed as a set of relative XYT coordinates between pairwise interest points in a local region. We learn a global Gaussian mixture model (GMM) (referred to as a universal background model) using the relative coordinate features from all the training videos, and then we represent each video as the normalized parameters of a video-specific GMM adapted from the global GMM. In order to capture the spatio-temporal relationships at different levels, multiple GMMs are utilized to describe the context distributions of interest points over multiscale local regions. Motivated by the observation that some actions share similar motion patterns, we additionally propose a novel mid-level class correlation feature to capture the semantic correlations between different action classes. Each input action video is represented by a set of decision values obtained from the pre-learned classifiers of all the action classes, with each decision value measuring the likelihood that the input video belongs to the corresponding action class. Moreover, human actions are often associated with some specific natural environments and also exhibit high correlation with particular scene classes. It is therefore beneficial to utilize the contextual scene information for action recognition. In this paper, we build the high-level co-occurrence relationship between action classes and scene classes to discover the mutual contextual constraints between action and scene. 
By treating the scene class label as a latent variable, we propose to use the latent structural SVM (LSSVM) model to jointly capture the compatibility between multilevel action features (e.g., low-level visual context distribution feature and the corresponding mid-level class correlation feature) and action classes, the compatibility between multilevel scene features (i.e., SIFT feature and the corresponding class correlation feature) and scene classes, and the contextual relationship between action classes and scene classes. Extensive experiments on UCF Sports, YouTube and UCF50 datasets demonstrate the effectiveness of the proposed multilevel features and action-scene interaction based LSSVM model for human action recognition. Moreover, our method generally achieves higher recognition accuracy than other state-of-the-art methods on these datasets.
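
The joint scoring idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact potential functions, so the additive linear form below, the function name `predict_action`, and all toy weights are assumptions made for illustration only, and training of the LSSVM is not shown. The sketch scores every (action, scene) pair as the sum of an action-feature term, a scene-feature term, and an action-scene co-occurrence term, then predicts the action by maximizing over the latent scene label.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_action(
    x_action: Sequence[float],
    x_scene: Sequence[float],
    w_action: List[Sequence[float]],
    w_scene: List[Sequence[float]],
    cooc: List[Sequence[float]],
) -> Tuple[int, int, float]:
    """Return (action label, latent scene label, joint score).

    Assumed (hypothetical) joint score for action y and scene h:
        w_action[y] . x_action + w_scene[h] . x_scene + cooc[y][h]
    The scene label h is treated as latent: it is not observed at test
    time, so we search over it together with the action label.
    """
    best = None
    for y, w_y in enumerate(w_action):
        action_term = dot(w_y, x_action)
        for h, w_h in enumerate(w_scene):
            score = action_term + dot(w_h, x_scene) + cooc[y][h]
            if best is None or score > best[0]:
                best = (score, y, h)
    score, y_hat, h_hat = best
    return y_hat, h_hat, score


# Toy example with 2 action classes and 2 scene classes (made-up numbers).
x_action = [0.5, 0.4]                     # stand-in for multilevel action features
x_scene = [0.2, 0.3]                      # stand-in for multilevel scene features
w_action = [[1.0, 0.0], [0.0, 1.0]]       # one weight vector per action class
w_scene = [[1.0, 0.0], [0.0, 1.0]]        # one weight vector per scene class
cooc = [[0.0, 0.0], [0.5, 0.0]]           # action 1 co-occurs strongly with scene 0
no_context = [[0.0, 0.0], [0.0, 0.0]]     # same model with the context term removed

with_context = predict_action(x_action, x_scene, w_action, w_scene, cooc)
without_context = predict_action(x_action, x_scene, w_action, w_scene, no_context)
```

In this toy setting the co-occurrence term changes the decision: without it, action 0 has the highest score, while with it the pair (action 1, scene 0) wins. This is only meant to show how contextual scene information can influence the predicted action, not to reproduce the paper's model.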

Journal: IEEE Transactions on Circuits and Systems for Video Technology
Publication Date: Aug 1, 2013
Citations: 44

Similar Papers

Action Recognition using Spatial and Temporal Features with Kernel SVM
Nivetha N
International Journal For Multidisciplinary Research | VOL. 5
31 Jul 2023

Recognizing human actions by attributes
Jingen Liu ... Benjamin Kuipers
01 Jun 2011

Learning a Deep Model for Human Action Recognition from Novel Viewpoints
Hossein Rahmani ... Mubarak Shah
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 40
06 Apr 2017

A Generative Approach to Zero-Shot and Few-Shot Action Recognition
Ashish Mishra ... Anurag Mittal
01 Mar 2018
