Abstract

This paper tackles the problem of efficient and stable video semantic segmentation. While stability has been under-explored, prevalent work on efficient video semantic segmentation follows the keyframe paradigm. These methods process videos efficiently by recomputing only the low-level features of each frame and reusing high-level features computed at selected keyframes. In addition, the reused features stabilize predictions across frames, thereby improving video consistency. However, dynamic scenes in the video can easily cause misalignments between reused and recomputed features, which hampers performance. Moreover, relying on feature reuse to improve prediction consistency is brittle; erroneous feature alignment can easily lead to unstable predictions. The keyframe paradigm therefore faces a dilemma between stability and performance. We address this efficiency and stability challenge with a novel yet simple Temporal Feature Correlation (TFC) module. It uses the cosine similarity between two frames' low-level features to indicate the consistency of semantic labels across frames. Specifically, we selectively reuse label-consistent features across frames through linear interpolation and update the others through sparse multi-scale deformable attention. As a result, we no longer rely on direct feature reuse for stability and thus effectively avoid feature misalignment. This work provides a significant step towards efficient and stable video semantic segmentation. On the VSPW dataset, our method significantly improves the prediction consistency of image-based methods while remaining comparably fast and accurate.
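
To make the TFC idea concrete, the following is a minimal PyTorch-style sketch of the core operations described above: per-pixel cosine similarity between two frames' low-level features, linear interpolation for positions deemed label-consistent, and a placeholder for refining the remaining positions (the actual method uses sparse multi-scale deformable attention). All function names, tensor shapes, and the similarity threshold are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F


def temporal_feature_correlation(feat_prev, feat_cur, threshold=0.9):
    """Sketch of the Temporal Feature Correlation (TFC) signal.

    feat_prev, feat_cur: low-level feature maps of shape (B, C, H, W)
    from the previous and current frame. Returns a per-pixel cosine
    similarity map and a boolean mask of positions whose semantic label
    is assumed to be consistent across the two frames.
    """
    # Per-pixel cosine similarity along the channel dimension.
    sim = F.cosine_similarity(feat_prev, feat_cur, dim=1)  # (B, H, W)

    # High-similarity positions are treated as label-consistent; the
    # threshold is an illustrative hyper-parameter, not from the paper.
    consistent = sim > threshold
    return sim, consistent


def fuse_high_level_features(high_prev, high_cur, sim, consistent):
    """Selectively reuse label-consistent features via linear interpolation.

    high_prev, high_cur: high-level features of shape (B, C, H, W),
    spatially aligned with `sim`. In the actual method, inconsistent
    positions are refined by sparse multi-scale deformable attention;
    here we simply keep the current-frame features as a placeholder.
    """
    alpha = sim.clamp(0, 1).unsqueeze(1)             # (B, 1, H, W) blend weight
    blended = alpha * high_prev + (1 - alpha) * high_cur
    mask = consistent.unsqueeze(1).float()
    # Reuse blended features where consistent; keep current features elsewhere.
    return mask * blended + (1 - mask) * high_cur
```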
