Abstract

Recognizing human actions in video is a very active research topic. A growing variety of human action datasets, with videos of different lengths and different performers, makes human action recognition a difficult problem. A majority of researchers address it by extracting key frames from the videos, and most papers use feature clustering methods to do so. On the one hand, the large variety of visual content in videos makes handcrafted features insufficiently effective, since no fixed descriptor can describe all video cases. On the other hand, traditional clustering algorithms are easily influenced by the choice of initial cluster centers. This paper proposes an unsupervised feature learning and clustering method for key frame extraction, which can be used for human action recognition. A stacked auto-encoder (SAE) is trained on videos of 10 different human actions and is then used as a feature extractor to learn features representing human actions. The Affinity Propagation clustering algorithm is used to select key frames from video sequences. Experiments on a variety of videos demonstrate that our method can effectively summarize video shots across different human actions.
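
For readers unfamiliar with Affinity Propagation, the standard message-passing updates are sketched below to show why the method needs no initial cluster centers: every frame is a candidate exemplar, and exemplars emerge from the exchanged messages. The abstract does not specify the similarity measure or the preference values. Taking s(i,k) as the negative squared Euclidean distance between the SAE features x_i and x_k of frames i and k, and setting the self-similarity s(k,k) to a shared preference value p, are assumptions made here for illustration only.

```latex
% Assumed similarity between frames i and k (not stated in the abstract):
s(i,k) = -\lVert x_i - x_k \rVert^2 \quad (i \neq k), \qquad s(k,k) = p

% Responsibility: how well suited frame k is to serve as the exemplar for frame i
r(i,k) \leftarrow s(i,k) - \max_{k' \neq k} \bigl\{ a(i,k') + s(i,k') \bigr\}

% Availability: how appropriate it is for frame i to choose frame k as its exemplar
a(i,k) \leftarrow \min\Bigl\{ 0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0,\, r(i',k)\} \Bigr\} \quad (i \neq k)

a(k,k) \leftarrow \sum_{i' \neq k} \max\{0,\, r(i',k)\}

% After the messages converge, frame i is assigned the exemplar
\hat{k}_i = \arg\max_{k} \bigl\{ a(i,k) + r(i,k) \bigr\}
```

Under these assumptions, the frames k with \hat{k}_k = k are the exemplars and would serve as the extracted key frames. In practice the updates are usually damped, and the preference p controls how many exemplars emerge.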
