Abstract
Human skeleton joints captured by RGB-D cameras are widely used in action recognition because they provide robust and comprehensive 3D information. At present, most skeleton-based action recognition methods treat all skeletal joints as equally important, both spatially and temporally. However, the contributions of individual joints vary significantly. Hence, a GL-LSTM+Diff model is proposed to improve the recognition of human actions. A global spatial attention (GSA) model assigns different weights to different skeletal joints, providing precise spatial information for action recognition. An accumulative learning curve (ALC) model highlights which frames contribute most to the final decision by assigning varying temporal weights to the intermediate accumulated learning results. By integrating the proposed GSA (spatial) and ALC (temporal) models into an LSTM framework that takes human skeletal joints as input, a global spatio-temporal action recognition framework (GL-LSTM) is constructed. Diff is introduced as a preprocessing method that enhances the dynamics of the features, yielding more distinguishable features for deep learning. Rigorous experiments on the large-scale NTU RGB+D dataset and the commonly used small SBU dataset show that the proposed algorithm outperforms other state-of-the-art methods.
Highlights
Human action recognition has a wide range of applications [1], such as human-computer interaction, video surveillance, health care, and entertainment
The present paper proposes a global spatio-temporal attention model, as shown in Figure 1, which takes all frames of each action as input and obtains the weight of each joint for action recognition
To further examine the effectiveness of the global spatial attention model (that is, which action types it benefits most), this paper measures the improvement on the NTU RGB+D dataset and lists the top 10 actions with the largest gains
Summary
Human action recognition has a wide range of applications [1], such as human-computer interaction, video surveillance, health care, and entertainment. In a sequence of actions, each frame may differ completely in its importance for recognizing the action, and likewise each joint contributes differently to different actions. In response to this problem, the mainstream practice at present is to embed an attention model into deep learning. Only after reading the entire action sequence in a complete way can one reliably determine which moments of the action are more important and which joints carry greater weight in recognition. Inspired by this observation, the present paper proposes a global spatio-temporal attention model, which takes all frames of each action as input and obtains the weight of each joint for action recognition. Diff is proposed as the basic feature for deep learning, which significantly improves action recognition performance
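As an illustration of the two ideas above, here is a minimal sketch (not the authors' implementation; the function names, the fixed attention scores, and the use of a plain softmax over joints are assumptions) of Diff-style frame differencing followed by per-joint attention weighting:

```python
import numpy as np

def diff_features(seq):
    """Frame-to-frame differences of skeleton joint coordinates.

    seq: array of shape (T, J, 3) -- T frames, J joints, 3D coordinates.
    Returns shape (T-1, J, 3), emphasizing motion dynamics over static pose.
    """
    return seq[1:] - seq[:-1]

def spatial_attention(features, scores):
    """Weight each joint by a softmax-normalized score.

    features: (T, J, D); scores: (J,) raw per-joint scores. In the paper
    these weights are learned; here they are fixed for illustration.
    """
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()              # softmax over joints
    return features * alpha[None, :, None]   # broadcast over frames and dims

# Toy sequence: 4 frames, 3 joints, 3D coordinates
seq = np.arange(4 * 3 * 3, dtype=float).reshape(4, 3, 3)
d = diff_features(seq)                               # shape (3, 3, 3)
weighted = spatial_attention(d, np.array([0.1, 2.0, 0.5]))
```

In a full pipeline, `weighted` would be flattened per frame and fed to the LSTM, with the temporal (ALC-style) weighting applied to the per-frame outputs before the final classification.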