Abstract
In this paper, we study language-level video object segmentation, where a first-frame language annotation is provided to describe the target object. By exploiting the fact that a language label is normally compatible with all frames in a video, the proposed method can choose the most suitable starting frame, mitigating the initialization failure issue. Moreover, in addition to extracting visual features from static video frames, we propose a motion-language score based on optical flow to better represent moving objects. Finally, scores from multiple criteria are aggregated using an attention-based mechanism to predict the final result. The proposed method is evaluated on four widely used video object segmentation datasets (DAVIS 2017, DAVIS 2016, SegTrack V2, and YoutubeObject), and achieves new state-of-the-art accuracy (mean region similarity) on both DAVIS 2017 (67.2%) and DAVIS 2016 (83.5%). Source code will be published together with the paper.