A robust and efficient method for skeleton‐based human action recognition and its application for cross‐dataset evaluation

Tien‐Thanh Nguyen, Dinh‐Tan Pham, Thi‐Lan Le, Hai Vu

doi:10.1049/cvi2.12119


Open Access

https://doi.org/10.1049/cvi2.12119


Abstract

Skeleton-based human action recognition has emerged recently thanks to its compactness and robustness to appearance variations. Although impressive results have been obtained in recent years, the performance of skeleton-based action recognition methods must still improve before they can be deployed in real-time applications. Recently, a lightweight network structure named Double-feature Double-motion Network (DD-Net) has been proposed for skeleton-based human action recognition. The DD-Net runs at high speed and achieves state-of-the-art performance on hand and body actions. However, the DD-Net is suited to actions that strongly correlate with the global trajectories, and it cannot distinguish actions that have only a weak connection with them. In this paper, the authors propose TD-Net, an improved version of the DD-Net in which a new branch is added. The new branch takes the normalised coordinates of joints (NCJ) as input to enrich the spatial information. On five datasets for skeleton-based human activity recognition (MSR-Action3D, CMDFall, JHMDB, FPHAB, and NTU RGB+D), the TD-Net consistently obtains superior performance compared with the baseline model DD-Net. The proposed method outperforms various state-of-the-art methods, both hand-designed and deep learning-based, on four datasets (MSR-Action3D, CMDFall, JHMDB, and FPHAB). Furthermore, the generalisation of the proposed method is confirmed through cross-dataset evaluation. To illustrate the potential use of the model for real-time human action recognition, the authors have deployed an application on an edge device. The experimental results show that the application can process up to 40 fps for pose estimation using MediaPipe, and that it takes only 0.04 ms to recognise an action from skeleton sequences.
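The abstract does not say how the normalised coordinates of joints (NCJ) are computed. As a rough illustration only, the sketch below assumes one common choice: each frame is centred on a reference joint, and the whole sequence is then divided by the largest joint-to-reference distance. The function name `normalise_joints`, the `root_index` parameter, and the scaling rule are hypothetical and are not taken from the paper.

```python
from math import sqrt
from typing import List, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) of one joint
Frame = List[Joint]                  # all joints in one frame


def normalise_joints(sequence: List[Frame], root_index: int = 0) -> List[Frame]:
    """Centre every frame on a root joint, then scale the sequence to unit size.

    Assumption: this normalisation is illustrative; the paper's abstract
    does not specify how NCJ are obtained.
    """
    centred: List[Frame] = []
    max_dist = 0.0
    for frame in sequence:
        rx, ry, rz = frame[root_index]
        # Translate so that the root joint sits at the origin of this frame.
        shifted = [(x - rx, y - ry, z - rz) for x, y, z in frame]
        centred.append(shifted)
        # Track the largest joint-to-root distance over the whole sequence.
        for x, y, z in shifted:
            max_dist = max(max_dist, sqrt(x * x + y * y + z * z))
    scale = max_dist if max_dist > 0 else 1.0  # avoid division by zero
    return [[(x / scale, y / scale, z / scale) for x, y, z in frame]
            for frame in centred]


# Two frames with two joints each; joint 0 plays the role of the root.
sequence = [
    [(1, 1, 1), (1, 3, 1)],
    [(2, 2, 2), (2, 2, 6)],
]
ncj = normalise_joints(sequence)
```

Under these assumptions the output does not change when the whole skeleton is translated, which is one way such a branch could supply spatial information that is independent of the global trajectory.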


Journal: IET Computer Vision
Publication Date: Jul 6, 2022
Citations: 12
License type: CC BY-NC-ND 4.0
