Abstract

Few-Shot Action Recognition (FSAR) aims at recognizing novel action classes with only a few labelled samples. Owing to its simplicity and effectiveness, the prototypical network has attracted increasing interest in the field of FSAR. A key challenge is to learn representative prototypes from a few labelled videos with varying action lengths and speeds. To address this issue, this paper presents a dual-prototype network that combines class-specific and query-specific attentive learning for FSAR. First, we propose a class-specific attentive learning method that computes the within-class similarity for each class of support samples. This method not only increases the representativeness of prototypes but also mitigates the impact of noise and outlying samples. Second, the class-specific attention is combined with query-specific attention to establish two parallel sets of prototypes for FSAR. The incorporation of query-specific attention further improves the discriminability of prototypes with respect to different query samples. Furthermore, we propose a temporal-relation model to capture the temporal dependencies of action videos with different lengths and speeds. The proposed method is validated on four benchmark datasets. Extensive experimental results demonstrate the superiority of our method over 11 state-of-the-art FSAR methods.
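
Since the abstract only names the two attention mechanisms, the following is a minimal sketch of what class-specific and query-specific attentive prototyping could look like. The attention form (cosine similarity followed by a softmax), the function names, and the tensor shapes are assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of dual attentive prototypes; not the authors' code.
import torch
import torch.nn.functional as F


def class_specific_prototypes(support, labels, num_classes):
    """support: (N, D) embeddings of support videos; labels: (N,) class ids.
    Each support sample is weighted by its average within-class similarity,
    so noisy or outlying samples contribute less to the class prototype."""
    prototypes = []
    for c in range(num_classes):
        feats = support[labels == c]                       # (K, D) samples of class c
        sim = F.cosine_similarity(feats.unsqueeze(1),      # (K, K) pairwise similarity
                                  feats.unsqueeze(0), dim=-1)
        attn = torch.softmax(sim.mean(dim=1), dim=0)       # (K,) within-class attention
        prototypes.append((attn.unsqueeze(1) * feats).sum(dim=0))
    return torch.stack(prototypes)                         # (C, D)


def query_specific_prototypes(support, labels, query, num_classes):
    """query: (D,) embedding of one query video.
    Re-weights support samples by their similarity to the query, giving a
    second, query-conditioned set of prototypes."""
    prototypes = []
    for c in range(num_classes):
        feats = support[labels == c]                       # (K, D)
        attn = torch.softmax(
            F.cosine_similarity(feats, query.unsqueeze(0), dim=-1), dim=0)
        prototypes.append((attn.unsqueeze(1) * feats).sum(dim=0))
    return torch.stack(prototypes)                         # (C, D)
```

In this reading, classification would compare the query embedding against both prototype sets, so that one set stays stable per class while the other adapts to each query; how the two are fused is not specified in the abstract.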
