Abstract

The newly emerging language-based video moment retrieval task aims to retrieve a target moment from an untrimmed video given a natural language query. It is more practical than traditional whole-video retrieval, since it can accurately localize a specific moment within a video. In this work, we propose a novel solution that investigates language-based video moment retrieval from the perspective of adversarial learning. The key to our solution is to formulate the task as an adversarial learning problem with two tightly connected components. Specifically, a reinforcement learning module is employed as the generator to produce a set of candidate video moments. Meanwhile, a multi-task learning module serves as the discriminator, integrating inter-modal and intra-modal information in a unified framework through a sequential update strategy. Finally, the generator and the discriminator are mutually reinforced through adversarial learning, which jointly optimizes the performance of both video moment ranking and video moment localization. Extensive experimental results on two challenging benchmarks, i.e., the Charades-STA and TACoS datasets, demonstrate the effectiveness and rationality of our proposed solution. Moreover, on the larger and unbiased datasets, i.e., ActivityNet Captions and ActivityNet-CD, our framework exhibits excellent robustness.

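The abstract does not give the exact architecture or objectives, so the following is only a minimal sketch, assuming PyTorch and hypothetical `MomentGenerator`/`MomentDiscriminator` modules, of how an RL-based candidate generator and a cross-modal matching discriminator could be mutually reinforced in the way described: the discriminator learns to separate ground-truth moments from generated ones, and its score is fed back to the generator as a reward in a REINFORCE-style update.

```python
import torch
import torch.nn as nn

# Hypothetical generator: scores candidate temporal moments for a (video, query)
# pair and samples one as its "action", in the spirit of an RL policy over proposals.
class MomentGenerator(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.scorer = nn.Linear(feat_dim * 2, 1)

    def forward(self, cand_feats, query_feat):
        # cand_feats: (B, N, D) features of N candidate moments
        # query_feat: (B, D) sentence embedding of the query
        q = query_feat.unsqueeze(1).expand_as(cand_feats)
        logits = self.scorer(torch.cat([cand_feats, q], dim=-1)).squeeze(-1)  # (B, N)
        dist = torch.distributions.Categorical(torch.softmax(logits, dim=-1))
        action = dist.sample()                      # index of the chosen moment
        return action, dist.log_prob(action)

# Hypothetical discriminator: predicts how well a chosen moment matches the query.
class MomentDiscriminator(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim * 2, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, moment_feat, query_feat):
        return torch.sigmoid(self.mlp(torch.cat([moment_feat, query_feat], dim=-1)))

# One adversarial step: train the discriminator to separate ground-truth moments
# from generated ones, then use its score on the generated moment as a reward
# for a policy-gradient update of the generator.
def adversarial_step(gen, disc, g_opt, d_opt, cand_feats, query_feat, gt_feat):
    action, log_prob = gen(cand_feats, query_feat)
    picked = cand_feats[torch.arange(cand_feats.size(0)), action]  # (B, D)

    # Discriminator update: binary cross-entropy, real vs. generated moment.
    d_real = disc(gt_feat, query_feat)
    d_fake = disc(picked.detach(), query_feat)
    d_loss = -(torch.log(d_real + 1e-8) + torch.log(1 - d_fake + 1e-8)).mean()
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: discriminator score acts as the reward signal (REINFORCE).
    reward = disc(picked, query_feat).detach().squeeze(-1)
    g_loss = -(reward * log_prob).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```

This sketch omits the multi-task (inter-modal and intra-modal) losses and the sequential update strategy mentioned in the abstract; it only illustrates the basic generator-discriminator feedback loop under the stated assumptions.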