AGPN: Action Granularity Pyramid Network for Video Action Recognition

Yatong Chen, Hongwei Ge, Yuxuan Liu, Liang Sun, Xinye Cai

doi:10.1109/tcsvt.2023.3235522

Abstract

Video action recognition is a fundamental task in video understanding. Recognizing actions in complex spatio-temporal contexts generally requires fusing action information at multiple granularities. However, existing works do not model or fuse spatio-temporal information from the perspective of action granularity. To address this problem, this paper proposes an Action Granularity Pyramid Network (AGPN) for action recognition, which can be flexibly integrated into 2D backbone networks. Its core module, the Action Granularity Pyramid Module (AGPM), is a hierarchical pyramid structure with residual connections that fuses multi-granularity spatio-temporal action information. From the top level to the bottom level of the pyramid, the receptive field decreases and the action granularity becomes finer. To enrich the temporal information of the inputs, a Multiple Frame Rate Module (MFM) is proposed to mix different frame rates at a fine-grained, pixel-wise level. Moreover, a Spatio-temporal Anchor Module (SAM) is employed to fix spatio-temporal feature anchors, which improves the effectiveness of feature extraction. We conduct extensive experiments on three large-scale action recognition datasets: Something-Something V1 and V2, and Kinetics-400. The results demonstrate that the proposed AGPN outperforms state-of-the-art methods on video action recognition.
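
As a rough illustration of the coarse-to-fine fusion idea in the abstract, the sketch below pools a sequence of per-frame features over progressively smaller temporal windows and combines the levels with a residual connection. It is not the authors' AGPM: the function names, the window sizes (8, 4, 2, 1), the averaging across levels, and the use of one scalar per frame instead of 2D backbone feature maps are all assumptions made for illustration.

```python
from typing import List, Sequence


def pool_and_upsample(x: Sequence[float], window: int) -> List[float]:
    """Average x over non-overlapping temporal windows, then repeat each
    window mean so the output has the same length as x."""
    out: List[float] = []
    for start in range(0, len(x), window):
        chunk = x[start:start + window]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * len(chunk))
    return out


def granularity_pyramid(
    x: Sequence[float],
    windows: Sequence[int] = (8, 4, 2, 1),  # hypothetical level sizes
) -> List[float]:
    """Fuse per-frame features across temporal granularities.

    Levels run top to bottom with shrinking windows (coarse to fine),
    so the temporal receptive field decreases at each level.
    """
    fused = [0.0] * len(x)
    for window in windows:
        level = pool_and_upsample(x, window)
        fused = [f + l for f, l in zip(fused, level)]
    # Average the levels, then add the input back as a residual connection.
    return [xi + f / len(windows) for xi, f in zip(x, fused)]


if __name__ == "__main__":
    frames = [0, 1, 2, 3, 4, 5, 6, 7]  # toy per-frame feature values
    print(granularity_pyramid(frames))
```

In this toy version the top level summarizes the whole clip and the bottom level keeps each frame as is, which mirrors the abstract's statement that granularity becomes finer toward the bottom of the pyramid.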

Journal: IEEE Transactions on Circuits and Systems for Video Technology
Publication Date: Aug 1, 2023
Citations: 13

Similar Papers

Integrating Gaussian mixture model and dilated residual network for action recognition in videos
Ming Fang ... Jianwei Zhao
Multimedia Systems | VOL. 26
20 Aug 2020

DarkLight Networks for Action Recognition in the Dark
Rui Chen ... Zixi Liang
01 Jun 2021

Multipath Attention and Adaptive Gating Network for Video Action Recognition
Haiping Zhang ... Conghao Ma
Neural Processing Letters | VOL. 56
27 Mar 2024

DANet: Semi-supervised differentiated auxiliaries guided network for video action recognition
Guangyu Gao ... A.K Qin
Neural Networks | VOL. 158
17 Nov 2022
