Abstract

At present, research on human action recognition has achieved remarkable results and is widely applied across industries. In particular, human action recognition based on deep learning has developed rapidly. Given sufficient labeled data, supervised learning methods can achieve satisfactory recognition performance. However, the diversity of motion types and the complexity of video backgrounds make annotating human motion videos labor-intensive, which severely restricts the application of supervised human action recognition methods in practical scenarios. Because zero-shot learning can recognize unseen action categories without relying on large amounts of labeled data, action recognition methods based on zero-shot learning have received great attention from researchers in recent years. In this paper, we propose ADZSAR, an attention-based zero-shot action recognition model. We design a novel attention-based feature extraction method and introduce the current state-of-the-art semantic embedding model (Word2Vec). Experiments show that our method performs best among comparable zero-shot action recognition methods based on spatio-temporal features.
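To make the zero-shot setup concrete, the sketch below shows the standard inference step such methods share: a video feature is projected into the word-embedding space, and the predicted label is the unseen class whose Word2Vec embedding is closest by cosine similarity. The function names and the toy 3-d vectors here are hypothetical illustrations, not the paper's actual model or embeddings.

```python
# A minimal sketch of zero-shot inference in a shared semantic space,
# assuming the visual branch has already projected a clip into the same
# space as the label word embeddings. All names and vectors below are
# hypothetical toy values for illustration only.
import numpy as np

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def zero_shot_classify(video_emb, class_embs):
    # Predict the unseen class whose word embedding is nearest
    # to the projected video feature.
    return max(class_embs, key=lambda name: cosine(video_emb, class_embs[name]))

# Toy stand-ins for Word2Vec embeddings of unseen action labels.
unseen = {
    "fencing": np.array([0.9, 0.1, 0.0]),
    "surfing": np.array([0.1, 0.9, 0.2]),
}
video = np.array([0.8, 0.2, 0.1])  # projected visual feature (toy value)
print(zero_shot_classify(video, unseen))  # → fencing
```

In practice the projection from spatio-temporal features into the embedding space is learned on seen classes; at test time only the unseen labels' word vectors are needed, which is what removes the need for labeled examples of those classes.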
