Abstract

Cross-view action recognition plays a pivotal role in surveillance settings where multiple cameras are required for monitoring, yet its performance remains limited. One of the main challenges researchers face is the divergence in viewpoints. This paper presents a semi-supervised cross-domain feature construction and alignment method for action recognition from cross-view cameras. The proposed framework consists of four main steps. First, shape context features are extracted, integrated, and encoded as a pose-level feature representation in a low-dimensional space. Second, key poses are obtained by clustering the pose features and are used to build the sequence-level feature representation. Third, the integrated sequence-level features from the different view domains are matched via nonlinear kernelized manifold alignment and brought into a shared space by learning two projection matrices, yielding a transferable model for cross-view action representation. Finally, a linear SVM classifier is trained to label unseen actions from the different view domains. Experimental results on benchmark datasets report recognition rates of 90%, 90.91%, and 92.95% on the INRIA Xmas Motion Acquisition Sequences, Northwestern-UCLA Multi-View 3D Actions, and multi-view action recognition datasets, respectively, showing that our method achieves superior performance.
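The second step (clustering pose features into key poses and deriving a sequence-level representation) can be illustrated with a minimal sketch. This is not the paper's implementation: the shape-context extraction, kernelized manifold alignment, and SVM training are omitted, and the synthetic data, feature dimensions, and function names (`kmeans`, `sequence_feature`) are illustrative assumptions. A common bag-of-key-poses style encoding looks like:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Simple k-means: cluster pose-level features into k 'key poses'."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # copy via fancy indexing
    for _ in range(iters):
        # assign each pose to its nearest key pose
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # move each key pose to the mean of its assigned poses
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def sequence_feature(seq, centers):
    """Sequence-level feature: normalized histogram of nearest key poses."""
    dists = np.linalg.norm(seq[:, None] - centers[None], axis=2)
    labels = dists.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

# Toy demo with synthetic pose features (200 frames, 16-D each).
rng = np.random.default_rng(1)
poses = rng.normal(size=(200, 16))
centers = kmeans(poses, k=8)
feat = sequence_feature(poses[:30], centers)
print(feat.shape)  # (8,)
```

In this sketch a video sequence becomes a fixed-length histogram over key poses, which is the kind of sequence-level representation that could then be aligned across view domains and fed to a linear classifier.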
