Abstract

Fine-grained action recognition involves comparing similar actions of variable length that consist of subtle interactions between humans and specific objects. Hence, we propose a dynamic kernel-based approach to handle variable-length patterns for effective recognition of fine-grained actions. First, we extract local spatio-temporal features from each video to capture appearance and motion information. An action-independent Gaussian mixture model (AIGMM) is then trained on the features extracted from all fine-grained actions to analyze spatio-temporal information and preserve the local similarities among fine-grained actions. The statistics of the AIGMM, namely the means, covariances, and posteriors, are then mapped to a kernel feature space to build kernels that measure the similarity between any two fine-grained actions. We demonstrate the effectiveness of the proposed approach using three dynamic kernels, namely the GMM mean interval kernel, the supervector kernel, and the intermediate matching kernel, on four fine-grained action datasets: MERL, JIGSAWS, KSCGR, and MPII Cooking 2.
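
To make the idea of a dynamic kernel concrete, the following is a minimal sketch of one common form of a GMM supervector kernel, not the exact formulation used in this work. It assumes the AIGMM has already been trained and is given by hypothetical arrays `weights`, `means`, and `variances` (diagonal covariances). It also assumes mean-only MAP adaptation with a relevance factor, a choice borrowed from the speaker-verification literature. The function names and the `relevance` parameter are illustrative.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Posterior probability of each Gaussian component for each feature vector.

    X: (T, d) local features of one video (T varies per video).
    weights: (Q,), means: (Q, d), variances: (Q, d) diagonal covariances.
    """
    diff = X[:, None, :] - means[None, :, :]                 # (T, Q, d)
    # Log-density of every feature under every diagonal Gaussian.
    log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_pdf                    # (T, Q)
    log_joint -= log_joint.max(axis=1, keepdims=True)        # numerical stability
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)

def supervector(X, weights, means, variances, relevance=16.0):
    """Map a variable-length feature set to a fixed-length supervector.

    Assumption: mean-only MAP adaptation; `relevance` is an illustrative value.
    """
    resp = posteriors(X, weights, means, variances)          # (T, Q)
    counts = resp.sum(axis=0)                                # soft counts, (Q,)
    first_order = resp.T @ X                                 # (Q, d)
    data_means = first_order / np.maximum(counts, 1e-10)[:, None]
    alpha = (counts / (counts + relevance))[:, None]
    adapted = alpha * data_means + (1.0 - alpha) * means     # adapted means
    # Scale each adapted mean by sqrt(weight) and inverse standard deviation.
    scaled = np.sqrt(weights)[:, None] * adapted / np.sqrt(variances)
    return scaled.ravel()                                    # length Q * d

def supervector_kernel(X1, X2, weights, means, variances, relevance=16.0):
    """Linear kernel between the supervectors of two videos."""
    s1 = supervector(X1, weights, means, variances, relevance)
    s2 = supervector(X2, weights, means, variances, relevance)
    return float(s1 @ s2)
```

Because each video is reduced to a vector of length Q * d regardless of how many local features it contains, two videos of different lengths can be compared directly, and the resulting kernel values can be passed to a kernel classifier such as an SVM.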
