Abstract

Contrastive Language-Image Pretraining (CLIP) models have achieved significant success and have markedly improved the performance of various downstream tasks, including action recognition. However, how to effectively introduce knowledge into action recognition models remains an open question. In this work, External and Priori Knowledge CLIP (EPK-CLIP) is proposed to introduce external knowledge into the model. To capture external knowledge, an external knowledge embedding module is proposed, which generates and exploits human-object interaction relations as external knowledge, enabling the model to learn better features. Furthermore, a sparse regularization term is introduced into the loss function, endowing the model with the ability to exploit the sparse prior knowledge inherent in the classification task. Finally, a multiple inference module is proposed to obtain classification results from both direct and indirect perspectives; the final classification result is obtained by fusing the outputs of the different reasoning branches. Moreover, four external knowledge datasets, Kinetics-400-VC, Jester-VC, HMDB-51-VC, and UCF-101-VC, are built and released for public use, each a multimodal extension of the corresponding action recognition dataset. Under fully-supervised settings, our model achieves top-1 accuracies of 84.3%, 97.1%, 82.9%, and 98.2% on Kinetics-400, Jester, HMDB-51, and UCF-101, respectively. In zero-shot experiments, our model also achieves state-of-the-art results, with top-1 accuracies of 51.6% and 77.7% on HMDB-51 and UCF-101, respectively. All related datasets and code can be found at https://github.com/geek12138/EPK-CLIP.
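
To make the sparsity prior mentioned above concrete, the following is a minimal sketch, not the authors' released implementation, of a CLIP-style classification loss augmented with an L1 penalty on the video-text similarity scores. The function name, the lambda_sparse weight, and the choice of penalizing cosine similarities are assumptions made here purely for illustration.

    import torch
    import torch.nn.functional as F

    def sparse_regularized_loss(video_emb, text_emb, labels,
                                lambda_sparse=0.01, temperature=0.01):
        # video_emb: (B, D) clip features; text_emb: (C, D) class-prompt features.
        video_emb = F.normalize(video_emb, dim=-1)
        text_emb = F.normalize(text_emb, dim=-1)
        sims = video_emb @ text_emb.t()                    # (B, C) cosine similarities
        ce = F.cross_entropy(sims / temperature, labels)   # standard classification loss
        sparsity = sims.abs().mean()                       # L1 term: push most class scores toward zero
        return ce + lambda_sparse * sparsity

The L1 term encodes the assumption that each clip should match only a few action classes, so most similarity scores are driven toward zero while the cross-entropy term keeps the correct class score high.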
