Abstract
Human action recognition from realistic videos has attracted increasing attention in many practical applications such as online video surveillance and content-based video management. Recognition based on the action alone often fails to distinguish similar action categories because of the complex background settings in realistic videos. In this paper, a novel action-scene model is explored to learn the contextual relationship between actions and scenes in realistic videos. With little prior knowledge of scene categories, a generative probabilistic framework is used to infer actions directly from the background based on visual words. Experimental results on a realistic video dataset validate the effectiveness of the action-scene model for recognizing actions from background settings. Extensive experiments were conducted with different feature extraction methods, and the results show that the learned model remains robust when the features are noisy.
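As a rough illustration of this kind of generative inference, the following Python sketch scores actions from a bag of background visual words via Bayes' rule. It is a minimal naive-Bayes-style stand-in, not the paper's actual model; all function names, variables, and the toy data are hypothetical.

```python
import numpy as np

# Hypothetical generative action-scene inference sketch (not the paper's model).
# Each video's background is assumed to be represented as a histogram of
# quantized visual words (indices into a codebook).

def train(word_histograms, action_labels, n_actions, vocab_size, alpha=1.0):
    """Estimate P(action) and P(word | action) with Laplace smoothing."""
    priors = np.zeros(n_actions)
    word_counts = np.full((n_actions, vocab_size), alpha)  # alpha = smoothing
    for hist, a in zip(word_histograms, action_labels):
        priors[a] += 1
        word_counts[a] += hist
    priors /= priors.sum()
    likelihoods = word_counts / word_counts.sum(axis=1, keepdims=True)
    return priors, likelihoods

def infer_action(hist, priors, likelihoods):
    """Return the action maximizing log P(action) + sum_w count(w) * log P(w | action)."""
    log_post = np.log(priors) + hist @ np.log(likelihoods).T
    return int(np.argmax(log_post))

# Toy usage: 3 action classes, a 5-word scene codebook.
rng = np.random.default_rng(0)
hists = [rng.integers(0, 10, 5) for _ in range(30)]
labels = [i % 3 for i in range(30)]
priors, lik = train(hists, labels, n_actions=3, vocab_size=5)
print(infer_action(hists[0], priors, lik))
```

The smoothing constant `alpha` keeps word likelihoods nonzero, which is what lends this style of model some robustness to noisy features.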