Abstract

Despite great effort, current action recognition systems still do not perform well enough for many applications. The majority of existing methods utilise image descriptors, or combinations of them, which have the fundamental limitation that they can vary significantly from person to person and are sensitive to changes in illumination, clothing, etc. Such methods thus often require a large number of training examples to cover as much variation as possible. In this paper, we propose to extract minimal representative information, namely deforming skeleton graphs corresponding to foreground shapes, to effectively represent actions and remove the influence of these typical variations. We propose a novel approach to action recognition based on matching of skeleton graphs, combining a static pairwise graph similarity measure using optimal subsequence bijection with dynamic time warping to robustly handle topological and temporal variations. For common periodic actions, we extract a consistent starting frame from each video to temporally align deforming skeleton graphs. We further develop a hierarchical matching strategy to significantly improve matching efficiency while keeping recognition accuracy. Our method outperforms the state-of-the-art methods on the standard benchmark datasets KTH (97.7%), UCF sport (92.3%) and Olympic sports (80.5%).
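To illustrate the temporal part of the matching described in the abstract, the following is a minimal sketch of classic dynamic time warping over two frame sequences. It is not the paper's implementation: the function name `dtw_distance` and the `frame_cost` callback are hypothetical, and the per-frame cost here is a stand-in for the skeleton graph similarity (which the paper computes with optimal subsequence bijection, not shown).

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def dtw_distance(
    seq_a: Sequence[T],
    seq_b: Sequence[T],
    frame_cost: Callable[[T, T], float],
) -> float:
    """Accumulated cost of the best monotonic alignment of two sequences.

    frame_cost is a hypothetical placeholder for a per-frame dissimilarity,
    e.g. a distance between two skeleton graphs.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] = minimal cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = frame_cost(seq_a[i - 1], seq_b[j - 1])
            # A frame may be matched one-to-one, or one sequence may
            # "pause" while the other advances.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def abs_diff(a: float, b: float) -> float:
    # Toy per-frame cost on scalars, used only for demonstration.
    return abs(a - b)
```

With scalar "frames", `dtw_distance([1, 2, 3], [1, 2, 2, 3], abs_diff)` evaluates to 0, because the repeated frame is absorbed by the warping; this is the property that makes the alignment tolerant to differences in action speed.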
