Abstract

Two-stream convolutional neural networks have been widely applied to action recognition. However, the two streams are typically used to capture spatial and temporal information separately, which ignores the strong complementarity and correlation between spatial and temporal information in videos. To address this problem, we propose a Spatial-Temporal Interaction Learning Two-stream network (STILT) for action recognition. Our two-stream network (i.e., a spatial stream and a temporal stream) contains a spatial–temporal interaction learning module, which uses an alternating co-attention mechanism between the two streams to learn the correlation between spatial features and temporal features. This module allows the two streams to guide each other, generating optimized spatial and temporal attention features. The proposed network thus establishes an interactive connection between the two streams and efficiently exploits the attended spatial and temporal features to improve recognition accuracy. Experiments on three widely used datasets (UCF101, HMDB51 and Kinetics) show that the proposed network outperforms state-of-the-art models in action recognition.
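The abstract does not give implementation details of the interaction module, but alternating co-attention as commonly formulated (one stream's summary guides attention over the other, then the attended result guides attention back) can be sketched as follows. This is a minimal numpy illustration under assumed shapes and a hypothetical bilinear scoring weight `W`, not the paper's actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(features, guidance, W):
    # features: (n, d) set of feature vectors from one stream
    # guidance: (d,) query vector from the other stream
    # W: (d, d) hypothetical bilinear scoring weight
    scores = features @ W @ guidance           # (n,) relevance scores
    weights = softmax(scores)                  # attention distribution
    return weights @ features                  # (d,) attended summary

def alternating_co_attention(spatial, temporal, Ws, Wt):
    # Step 1: unguided summary of the temporal stream (mean pooling).
    t0 = temporal.mean(axis=0)
    # Step 2: temporal summary guides attention over spatial features.
    s_att = attend(spatial, t0, Ws)
    # Step 3: attended spatial features guide attention over temporal features.
    t_att = attend(temporal, s_att, Wt)
    return s_att, t_att
```

In this sketch each stream's attended features are conditioned on the other stream, which is the interactive guidance the module is described as providing; a real implementation would learn `Ws` and `Wt` end-to-end inside the two-stream CNN.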
