Spatial-Temporal Aggregated Shuffle Attention for Video Instance Segmentation of Traffic Scene

Chongren Zhao, Guangchen Chen, Zifen He, Ying Huang, Yunnan Deng, Yinhui Zhang

doi:10.1587/transinf.2022edp7147

Abstract

To address the dispersed and misaligned spatial focus regions in feature pyramid networks and the insufficient capture of feature dependencies in both the spatial and channel dimensions, this paper proposes spatial-temporal aggregated shuffle attention for video instance segmentation (STASA-VIS). First, a mixed subsampling (MS) module is designed to embed activated features from the target area of the low-level pyramid layers into the high-level layers, thereby aggregating spatial information on the target area. Second, taking advantage of the coherence between video frames, STASA-VIS uses the first of every 5 video frames as a keyframe, propagates the keyframe feature maps of the pyramid layers forward in the time domain, and fuses them with the mixed-subsampled features of the non-keyframes to achieve temporally consistent feature aggregation. Finally, STASA-VIS embeds shuffle attention in the backbone to capture pixel-level pairwise relationships and dependencies among channels while reducing computation. Experimental results show that STASA-VIS reaches a segmentation accuracy of 41.2% at a test speed of 34 FPS, which exceeds the accuracy of state-of-the-art one-stage video instance segmentation (VIS) methods and achieves real-time segmentation.
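
The keyframe schedule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the interval of 5 comes from the abstract, while the names `keyframe_index`, `schedule`, and `fuse` are hypothetical, and the weighted average in `fuse` is only an assumed stand-in, since the abstract does not specify the fusion operator.

```python
from typing import List, Tuple

# From the abstract: the first of every 5 video frames is a keyframe.
KEYFRAME_INTERVAL = 5


def keyframe_index(frame_idx: int, interval: int = KEYFRAME_INTERVAL) -> int:
    """Return the index of the keyframe whose features are propagated to frame_idx."""
    if frame_idx < 0:
        raise ValueError("frame_idx must be non-negative")
    return (frame_idx // interval) * interval


def schedule(num_frames: int, interval: int = KEYFRAME_INTERVAL) -> List[Tuple[int, int, bool]]:
    """List (frame, governing keyframe, is_keyframe) for a clip of num_frames frames."""
    return [(i, keyframe_index(i, interval), i % interval == 0) for i in range(num_frames)]


def fuse(key_feat: List[float], frame_feat: List[float], weight: float = 0.5) -> List[float]:
    """Assumed fusion: element-wise weighted average of keyframe and non-keyframe features.

    The paper's actual fusion operator is not given in the abstract.
    """
    return [weight * k + (1.0 - weight) * f for k, f in zip(key_feat, frame_feat)]
```

Under this schedule, frames 0 to 4 all draw on the features of frame 0, frames 5 to 9 on those of frame 5, and so on, so the full keyframe path runs on only one frame in five.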

Journal: IEICE Transactions on Information and Systems
Publication Date: Feb 1, 2023
Citations: 1
License type: free

Similar Papers

Double Feature Pyramid Networks for Classification and Localization on Object Detection
Qi Yang ... Taiping Zhang
09 Oct 2022

A2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation
Miao Hu ... Lu Fang
01 Jun 2021

MGFPN: Enhancing multi-scale feature for object detection
Weiming He ... Yang Cao
Journal of Intelligent & Fuzzy Systems | VOL. 40
01 Jan 2020

A2-FPN for semantic segmentation of fine-resolution remotely sensed images
Rui Li ... Shunyi Zheng
International Journal of Remote Sensing | VOL. 43
01 Feb 2022
