Bridging spatiotemporal feature gap for video salient object detection

Zhenshan Tan, Cheng Chen, Keyu Wen, Zhangjie Fu

doi:10.1016/j.knosys.2024.112505

Abstract

The mutual transfer of spatiotemporal features is the main challenge in two-stream video salient object detection. Current methods rely on spatiotemporal feature interaction to achieve this transfer, but they still suffer from two issues: a modal feature gap and a layer feature gap. To address these issues, we propose a Bridging Spatiotemporal feature Gap Network (BSGNet) comprising a global correspondence interaction and gate filtering (GCGF) module, a global-local distribution consistency (GLDC) module, and a modality-layer feature fusion (MLFF) framework. Compared with previous works, BSGNet not only achieves more effective interaction through GCGF, but also bridges the modality and layer feature gaps through GLDC and MLFF. First, GCGF performs spatiotemporal feature interaction by modeling intra-modal and inter-modal global correspondences. In addition, GCGF employs a gate mechanism to control the proportion of message transfer between appearance and motion information, which characterizes the contribution provided by the spatiotemporal features. Second, at both the global and local levels, GLDC pushes the spatiotemporal feature distributions of the same scene together and pulls those of different scenes apart. This enhances distribution consistency, which aligns the spatiotemporal features and bridges the modal feature gap. Finally, MLFF provides an inter-modal and inter-layer feature fusion framework to bridge the layer feature gap caused by different modalities and different receptive fields. Extensive experiments on five benchmarks show that BSGNet outperforms state-of-the-art methods.
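As a rough illustration of what "a gate mechanism to control the proportion of message transfer" can mean in general, the following is a minimal sketch of element-wise sigmoid gating between an appearance feature vector and a motion feature vector. It is not the GCGF module from the paper, whose details are not given in the abstract. The function names (`sigmoid`, `gated_transfer`), the per-element weights `w_app`, `w_mot`, and `bias`, and the additive fusion rule are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function, returns a value in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_transfer(appearance, motion, w_app, w_mot, bias):
    """Add a gated share of `motion` to `appearance`, element by element.

    Hypothetical illustration only: the gate for each element is
    sigmoid(w_app * a + w_mot * m + bias), and the fused value is
    a + gate * m. A gate near 0 blocks the motion message, a gate
    near 1 passes it through almost unchanged.
    """
    inputs = (appearance, motion, w_app, w_mot, bias)
    if len({len(v) for v in inputs}) != 1:
        raise ValueError("all inputs must have the same length")

    fused, gates = [], []
    for a, m, wa, wm, b in zip(*inputs):
        g = sigmoid(wa * a + wm * m + b)
        gates.append(g)
        fused.append(a + g * m)
    return fused, gates


if __name__ == "__main__":
    app = [0.5, -1.0, 2.0]
    mot = [1.0, 1.0, 1.0]
    zeros = [0.0, 0.0, 0.0]
    # With zero weights and zero bias every gate is exactly 0.5,
    # so half of each motion value is transferred.
    fused, gates = gated_transfer(app, mot, zeros, zeros, zeros)
    print(gates)
    print(fused)
```

In a learned model the gate parameters would be trained rather than fixed, and the gate would typically be computed from feature maps rather than short vectors; the sketch only shows how a gate value scales the amount of information passed from one modality to the other.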

Journal: Knowledge-Based Systems
