Abstract

Recent years have witnessed growing interest in compressed video action recognition due to the rapid growth of online videos. Working in the compressed domain remarkably reduces storage by replacing raw videos with sparsely sampled RGB frames and other compressed motion cues (motion vectors and residuals). However, existing compressed video action recognition methods face two main issues: first, the inefficiency caused by processing coarse-level information at full resolution, and second, the disturbance caused by noisy dynamics in motion vectors. To address these two issues, this paper proposes a dynamic spatial focus method for efficient compressed video action recognition (CoViFocus). Specifically, we first use a lightweight two-stream architecture to localize the task-relevant patches in both the RGB frames and the motion vectors. The selected patch pair is then processed by a high-capacity two-stream deep model for the final prediction. This patch selection strategy crops out irrelevant motion noise in the motion vectors and reduces the spatial redundancy of the inputs, leading to the high efficiency of our method in the compressed domain. Moreover, we found that the motion vectors help our method address the "static issue", in which the focus patches get stuck at regions related to static objects rather than the target actions, further improving our method. Extensive results on both the HMDB-51 and UCF-101 datasets demonstrate the effectiveness and efficiency of our method in compressed video action recognition tasks.
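
The following is a minimal sketch of the two-stage focus-and-classify pipeline described above: a lightweight two-stream network predicts a focus location per stream, patches are cropped from the RGB frame and the motion vector field, and a high-capacity two-stream classifier fuses the patch pair. The module names (FocusNet, TwoStreamClassifier, crop_patch), layer sizes, and the 96x96 patch size are illustrative assumptions for exposition, not the authors' CoViFocus implementation.

import torch
import torch.nn as nn


def crop_patch(frames, centers, patch_size):
    """Crop a square patch around each predicted focus center.

    frames:  (B, C, H, W) full-resolution RGB frames or motion vectors
    centers: (B, 2) normalized (x, y) focus coordinates in [0, 1]
    """
    b, c, h, w = frames.shape
    half = patch_size // 2
    patches = []
    for i in range(b):
        cx = int(centers[i, 0].item() * (w - 1))
        cy = int(centers[i, 1].item() * (h - 1))
        # Clamp so the patch stays fully inside the frame.
        x0 = max(0, min(cx - half, w - patch_size))
        y0 = max(0, min(cy - half, h - patch_size))
        patches.append(frames[i:i + 1, :, y0:y0 + patch_size, x0:x0 + patch_size])
    return torch.cat(patches, dim=0)


class FocusNet(nn.Module):
    """Lightweight two-stream network predicting one focus center per stream."""

    def __init__(self):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2))
        self.rgb_branch = branch(3)  # RGB frames
        self.mv_branch = branch(2)   # motion vectors (dx, dy)

    def forward(self, rgb, mv):
        # Sigmoid keeps the predicted centers in [0, 1].
        return torch.sigmoid(self.rgb_branch(rgb)), torch.sigmoid(self.mv_branch(mv))


class TwoStreamClassifier(nn.Module):
    """High-capacity stand-in backbone per stream with late score fusion."""

    def __init__(self, num_classes=51):
        super().__init__()
        def backbone(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))
        self.rgb_backbone = backbone(3)
        self.mv_backbone = backbone(2)

    def forward(self, rgb_patch, mv_patch):
        return self.rgb_backbone(rgb_patch) + self.mv_backbone(mv_patch)


if __name__ == "__main__":
    rgb = torch.randn(2, 3, 224, 224)  # sparsely sampled RGB frames
    mv = torch.randn(2, 2, 224, 224)   # motion vectors from the compressed stream
    focus, classifier = FocusNet(), TwoStreamClassifier(num_classes=51)  # e.g. HMDB-51

    rgb_center, mv_center = focus(rgb, mv)        # stage 1: localize task-relevant regions
    rgb_patch = crop_patch(rgb, rgb_center, 96)   # crop the focus patches
    mv_patch = crop_patch(mv, mv_center, 96)
    logits = classifier(rgb_patch, mv_patch)      # stage 2: classify the patch pair
    print(logits.shape)                           # torch.Size([2, 51])

Note that the hard crop used here is non-differentiable and only illustrates the data flow; training the focus network end to end would require a differentiable or sampling-based patch selection.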
