Context-aware Biaffine Localizing Network for Temporal Sentence Grounding

Daizong Liu, Jianfeng Dong, Yu Cheng, Xiaoye Qu, Pan Zhou, Yulai Xie, Wei Wei, Zichuan Xu

doi:10.1109/cvpr46437.2021.01108

Abstract

This paper addresses the problem of temporal sentence grounding (TSG), which aims to identify the temporal boundary of a specific segment in an untrimmed video given a sentence query. Previous works either compare pre-defined candidate segments with the query and select the best one by ranking, or directly regress the boundary timestamps of the target segment. In this paper, we propose a novel localization framework that scores all pairs of start and end indices within the video simultaneously with a biaffine mechanism. In particular, we present a Context-aware Biaffine Localizing Network (CBLN), which incorporates both local and global contexts into the features of each start/end position for biaffine-based localization. The local contexts from adjacent frames help distinguish visually similar appearances, and the global contexts from the entire video contribute to reasoning about temporal relations. In addition, we develop a multi-modal self-attention module to provide a fine-grained, query-guided video representation for this biaffine strategy. Extensive experiments show that CBLN significantly outperforms state-of-the-art methods on three public datasets (ActivityNet Captions, TACoS, and Charades-STA), demonstrating the effectiveness of the proposed localization framework. The code is available at https://github.com/liudaizong/CBLN.
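
The abstract does not state the scoring formula, so the following is only a minimal sketch of what scoring all start/end pairs with a biaffine form could look like. It assumes the common biaffine form score(i, j) = h_s[i]^T U h_e[j] + w^T [h_s[i]; h_e[j]] + b, with pairs where the end precedes the start masked out. The names `biaffine_scores`, `best_segment`, `U`, `w`, and `b` are hypothetical and are not taken from the CBLN code; the context-aware features and the multi-modal self-attention module are not modeled here.

```python
def biaffine_scores(starts, ends, U, w, b):
    """Score every (start i, end j) pair with an assumed biaffine form.

    starts, ends: lists of T feature vectors, each of length d.
    U: d x d matrix, w: vector of length 2*d, b: scalar bias.
    Returns a T x T table; entries with j < i are -inf (invalid segments).
    """
    T = len(starts)
    d = len(starts[0])
    scores = [[float("-inf")] * T for _ in range(T)]
    for i in range(T):
        hs = starts[i]
        # hs^T U is shared by every end position j, so compute it once.
        hs_U = [sum(hs[k] * U[k][m] for k in range(d)) for m in range(d)]
        linear_start = sum(w[k] * hs[k] for k in range(d))
        for j in range(i, T):
            he = ends[j]
            bilinear = sum(hs_U[m] * he[m] for m in range(d))
            linear_end = sum(w[d + k] * he[k] for k in range(d))
            scores[i][j] = bilinear + linear_start + linear_end + b
    return scores


def best_segment(scores):
    """Return the (start, end) index pair with the highest score."""
    T = len(scores)
    pairs = ((i, j) for i in range(T) for j in range(i, T))
    return max(pairs, key=lambda p: scores[p[0]][p[1]])


if __name__ == "__main__":
    # Toy inputs: 3 video positions with 2-dimensional features.
    starts = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    ends = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    U = [[2.0, 0.0], [0.0, 1.0]]
    w = [0.0, 0.0, 0.0, 0.0]
    table = biaffine_scores(starts, ends, U, w, 0.0)
    print(best_segment(table))
```

Because every valid pair receives a score in one table, the predicted segment is simply the highest-scoring entry, in contrast to ranking a fixed set of candidate segments or regressing the two timestamps directly.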
