STA-CNN: Convolutional Spatial-Temporal Attention Learning for Action Recognition.

Hao Yang, Yunda Sun, Stephen J Maybank, Weiming Hu, Chunfeng Yuan, Li Zhang

doi:10.1109/tip.2020.2984904

Abstract

Convolutional Neural Networks (CNNs) have achieved excellent results for object recognition in still images. However, their improvement over traditional methods for recognizing actions in videos is less significant, because raw videos usually contain much more redundant or irrelevant information than still images. In this paper, we propose a Spatial-Temporal Attentive Convolutional Neural Network (STA-CNN), which automatically selects the discriminative temporal segments and focuses on the informative spatial regions. The STA-CNN model incorporates a Temporal Attention Mechanism and a Spatial Attention Mechanism into a unified convolutional network to recognize actions in videos. The novel Temporal Attention Mechanism automatically mines the discriminative temporal segments from long and noisy videos. The Spatial Attention Mechanism first exploits the instantaneous motion information in optical flow features to locate the motion-salient regions; it is then trained with an auxiliary classification loss and a Global Average Pooling layer to focus on the discriminative non-motion regions in the video frame. The STA-CNN model achieves state-of-the-art performance on two of the most challenging datasets, UCF-101 (95.8%) and HMDB-51 (71.5%).
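
The abstract does not say how the attention weights are computed, so the following is only a minimal sketch of the general idea of attention-weighted temporal pooling, not the STA-CNN implementation. It assumes each temporal segment already has a feature vector and a scalar relevance score, and that the scores are normalized with a softmax; the function names (`softmax`, `temporal_attention_pool`, `global_average_pool`) and the softmax choice are illustrative assumptions, not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_pool(
    segment_features: Sequence[Sequence[float]],
    segment_scores: Sequence[float],
) -> List[float]:
    """Combine per-segment features into one video-level feature.

    Assumption: attention weights are a softmax of the segment scores,
    so segments with higher scores contribute more to the result.
    """
    if len(segment_features) != len(segment_scores):
        raise ValueError("need exactly one score per segment")
    weights = softmax(segment_scores)
    dim = len(segment_features[0])
    pooled = [0.0] * dim
    for weight, feature in zip(weights, segment_features):
        for i in range(dim):
            pooled[i] += weight * feature[i]
    return pooled


def global_average_pool(feature_map: Sequence[Sequence[float]]) -> float:
    """Average a 2-D feature map (a list of rows) down to one scalar."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Three hypothetical segments with 2-D features; the last one is
    # scored as far more relevant, so it dominates the pooled feature.
    features = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
    scores = [0.0, 0.0, 10.0]
    print(temporal_attention_pool(features, scores))
    print(global_average_pool([[1.0, 3.0], [2.0, 2.0]]))
```

With equal scores the pooling reduces to a plain average over segments, which is the behavior the attention weights are meant to improve on for long, noisy videos.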

Journal: IEEE Transactions on Image Processing
Publication Date: Jan 1, 2020
Citations: 101

Similar Papers

Spatial-Temporal Feature-Based Sports Video Classification
Zengkai Wang
International Journal of Ambient Computing and Intelligence | VOL. 12
01 Oct 2021

Summary of fine-grained image recognition based on attention mechanism
Yao Ma ... Min Zhi
16 Feb 2022

A novel end-to-end model for steering behavior prediction of autonomous ego-vehicles using spatial and temporal attention mechanism
Lei Han ... Zexi Hua
Neurocomputing | VOL. 490
02 Dec 2021

SCEP—A New Image Dimensional Emotion Recognition Model Based on Spatial and Channel-Wise Attention Mechanisms
Bo Li ... Fang Miao
IEEE Access | VOL. 9
01 Jan 2020
