Knowledge-Constrained Answer Generation for Open-Ended Video Question Answering

Yao Jin, Xi Peng, Jun Yu, Guocheng Niu, Xinyan Xiao, Jian Zhang

doi:10.1609/aaai.v37i7.25983

Abstract

Open-ended video question answering (open-ended VideoQA) aims to understand video content and question semantics in order to generate correct answers. Most of the best-performing models treat the problem as a discriminative multi-label classification task. In real-world scenarios, however, it is difficult to define a candidate set that includes all possible answers. In this paper, we propose a Knowledge-constrained Generative VideoQA Algorithm (KcGA) with an encoder-decoder pipeline, which enables out-of-domain answer generation through an adaptive external knowledge module and a multi-stream information control mechanism. We use ClipBERT to extract video-question features. We then extract frame-wise, object-level external knowledge from a commonsense knowledge base and compute context-aware episode memory units via an attention-based GRU to form the external knowledge features. Finally, a multi-stream information control mechanism fuses the video-question and external knowledge features so that the two are semantically complementary and well aligned. We evaluate our model on two open-ended benchmark datasets and show that it generates high-quality answers effectively and robustly, without being restricted by the training data.
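The abstract does not spell out how the attention-based GRU computes the episode memory, so the following is only a minimal sketch of one common formulation, in which an attention weight takes the place of the GRU update gate. Everything here is an assumption made for illustration: the function name `attention_gru_episode`, the dot-product attention, the omission of the reset gate, and the random toy vectors standing in for knowledge facts and the video-question feature. It is not the authors' implementation.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_gru_episode(facts, query, W, U):
    """Summarise `facts` into one episode vector, guided by `query`.

    Hypothetical simplification: the attention weight g_t replaces the
    GRU update gate, and the reset gate is left out for brevity.
    """
    # One attention weight per fact, from its similarity to the query.
    gates = softmax([dot(fact, query) for fact in facts])
    hidden = [0.0] * len(query)
    for fact, gate in zip(facts, gates):
        # Candidate state from the current fact and the previous state.
        wx = matvec(W, fact)
        uh = matvec(U, hidden)
        candidate = [math.tanh(a + b) for a, b in zip(wx, uh)]
        # Facts with a high attention weight overwrite more of the state.
        hidden = [gate * c + (1.0 - gate) * h
                  for c, h in zip(candidate, hidden)]
    return hidden, gates


# Toy inputs (random placeholders, not real knowledge-base features).
random.seed(0)
dim = 4


def rand_vec():
    return [random.uniform(-1.0, 1.0) for _ in range(dim)]


facts = [rand_vec() for _ in range(3)]   # e.g. three retrieved facts
query = rand_vec()                       # e.g. a video-question feature
W = [rand_vec() for _ in range(dim)]
U = [rand_vec() for _ in range(dim)]

episode, gates = attention_gru_episode(facts, query, W, U)
```

In this sketch the final `episode` vector plays the role of an external knowledge feature that a later fusion step could combine with the video-question feature. How KcGA actually performs that fusion is not described in the abstract.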

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 26, 2023
Citations: 2
