Abstract

Video paragraph captioning aims to generate multiple descriptive sentences for a video, striving to match human writing in accuracy, logicality, and richness. However, current research focuses on the accuracy and temporal order of events, overlooking emotion as well as other critical logical relations embedded in human language, such as causal and adversative relations. This neglect impairs smooth transitions across generated event descriptions and restricts the vividness of expression, leaving a gap to the standard of human language. To narrow this gap, a framework that integrates logic and emotion representation learning is proposed. Concretely, a large-scale inter-event relation corpus is constructed based on the EMVPC dataset. This corpus, named EMVPC-EvtRel (standing for “EMVPC-Event Relations”), contains six logical relations widely used in human writing, 127 explicit inter-sentence connectives, and over 20,000 pairs of event segments with newly annotated logical relations. A logical semantic representation learning method is developed to recognize the dependencies between visual events, thereby enriching the representation of video content and boosting the logicality of generated paragraphs. Moreover, a fine-grained emotion recognition module is designed to uncover emotion features embedded in videos. Finally, experimental results on the EMVPC dataset demonstrate the superiority of the proposed method over existing state-of-the-art approaches.
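To make the corpus description concrete, the sketch below shows one plausible way an annotated event pair in EMVPC-EvtRel could be represented and loaded. The field names, the example relation labels, and the `load_relation_pairs` helper are hypothetical illustrations; the abstract does not specify the released data schema or file format.

```python
# Minimal sketch of a possible EMVPC-EvtRel record layout (hypothetical, not the
# authors' released schema). Each record pairs two event segments from the same
# video and annotates the logical relation and explicit connective between them.
from dataclasses import dataclass
from typing import List
import json


@dataclass
class EventRelationPair:
    """One annotated pair of event segments from a video."""
    video_id: str     # identifier of the source video in EMVPC
    event_a: str      # caption of the first event segment
    event_b: str      # caption of the second event segment
    relation: str     # one of the six logical relations, e.g. "causal", "adversative"
    connective: str   # explicit inter-sentence connective, e.g. "because", "however"


def load_relation_pairs(path: str) -> List[EventRelationPair]:
    """Read annotated event pairs from a JSON-lines file (assumed format)."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            pairs.append(EventRelationPair(**record))
    return pairs
```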
