Abstract

In cross-view action recognition, a persistent challenge is that the action representation loses its transferability when the feature space changes across views. To address this problem, a cross-view action recognition approach based on a bilayer discriminative model is proposed. We first extract the key poses that capture the essence of each action sequence and represent each key pose by a bag of visual words (BoVW) in a single view. We then construct a bipartite graph between the heterogeneous poses and apply multipartitioning to co-cluster the view-dependent visual words, yielding a cross-view bag-of-visual-words feature that remains discriminative under view changes. The key novelty is a bilayer classifier, combining an SVM at the frame level with an HMM at the sequence level, which compensates for the temporal information lost when a BoVW represents the whole action sequence. Finally, dynamic time warping (DTW) is used as a pruning algorithm to reduce the number of nodes explored when searching for the Viterbi path. Extensive experiments are performed on two well-known multi-view action datasets, IXMAS and N-UCLA, and a detailed comparison with existing view-invariant action recognition techniques shows that the proposed method is competitive in both accuracy and efficiency.
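To illustrate the DTW component mentioned above, the following is a minimal sketch of the classic dynamic-time-warping distance between two 1-D sequences. This is a generic textbook formulation, not the authors' implementation; the function name and the absolute-difference local cost are assumptions for illustration.

```python
def dtw_distance(a, b):
    # Classic DTW: cost of the best monotonic alignment between
    # sequences a and b, using absolute difference as the local cost.
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = minimal cumulative cost aligning a[:i] with b[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheaper of: insertion, deletion, or match
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# DTW tolerates local time stretching: [1, 2, 3] aligns perfectly
# with [1, 2, 2, 3] because one element may match several.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

In the paper's setting, such a distance could rank candidate sequences cheaply so that the HMM's Viterbi search only expands the most promising nodes; the exact pruning criterion is described in the full text.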
