Abstract
Current state-of-the-art techniques for video object segmentation require extensive training on video datasets with mask annotations, which constrains their zero-shot transfer to new data distributions and tasks. Recent advances in foundation models, particularly for image segmentation, have demonstrated strong generalization capabilities, introducing a prompt-driven paradigm for a variety of downstream segmentation tasks on new data distributions. This study explores the potential of vision foundation models under diverse prompting strategies and proposes a mask-free approach for unsupervised video object segmentation. To further improve the efficacy of prompt learning in diverse and complex video scenes, we introduce a spatial–temporal decoupled deformable attention mechanism that establishes effective correlations between intra- and inter-frame features. Extensive experiments on the DAVIS2017-unsupervised, YouTube-VIS 2019 & 2021, and OVIS datasets demonstrate that the proposed approach, trained without mask supervision, outperforms existing mask-supervised methods and generalizes to weakly annotated video datasets.
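To make the decoupling idea concrete, the sketch below illustrates one plausible reading of a spatial–temporal decoupled deformable attention step: per-frame deformable sampling aggregates intra-frame features at offset locations, and a separate temporal attention then correlates the resulting maps across frames. All function names, the nearest-neighbor sampling, and the use of externally supplied offsets and weights are simplifying assumptions for illustration; the paper's actual mechanism (learned offsets, bilinear sampling, multi-head attention) is not specified here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_deform_attn(frame, offsets, weights):
    """Intra-frame deformable aggregation (assumed form).

    frame:   (H, W, C) feature map
    offsets: (H, W, K, 2) sampling offsets per query location
    weights: (H, W, K) unnormalized attention logits per sample point
    """
    H, W, _ = frame.shape
    # Offset sampling grid; nearest-neighbor lookup stands in for
    # the bilinear interpolation a real implementation would use.
    ys = np.clip(np.round(np.arange(H)[:, None, None] + offsets[..., 0]).astype(int), 0, H - 1)
    xs = np.clip(np.round(np.arange(W)[None, :, None] + offsets[..., 1]).astype(int), 0, W - 1)
    sampled = frame[ys, xs]                       # (H, W, K, C)
    w = softmax(weights)[..., None]               # (H, W, K, 1)
    return (w * sampled).sum(axis=2)              # (H, W, C)

def temporal_attn(feats):
    """Inter-frame attention at each spatial location (assumed form)."""
    q = feats.mean(axis=0)                        # (H, W, C) pooled query
    scores = (feats * q[None]).sum(-1) / np.sqrt(feats.shape[-1])  # (T, H, W)
    w = softmax(scores, axis=0)[..., None]        # (T, H, W, 1)
    return (w * feats).sum(axis=0)                # (H, W, C)

def st_decoupled_attn(video, offsets, weights):
    """Decoupled pipeline: spatial deformable attention per frame,
    then temporal attention across the refined frame features."""
    spatial = np.stack([spatial_deform_attn(f, offsets[t], weights[t])
                        for t, f in enumerate(video)])
    return temporal_attn(spatial)
```

The decoupling keeps the sampling cost linear in the number of frames: each frame is refined independently before a lightweight per-pixel attention fuses temporal context.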