Abstract

Skeleton-based methods have made remarkable strides in human action recognition (HAR). However, the performance of existing unimodal approaches is still limited by the lack of diverse visual features in skeleton data. Concretely, because skeleton data lack interaction information between individuals and objects, skeleton-based methods tend to confuse similar actions. Moreover, the view invariance of unimodal models is limited. In this work, we propose a skeleton-guided multimodal data fusion method that transforms the depth, RGB, and optical flow modalities into human-centric images (HCI) based on keypoint sequences. Building on this foundation, we introduce a human-centric multimodal fusion network (HCMFN) that comprehensively extracts the action patterns of the different modalities. Our model significantly enhances the performance of skeleton-based techniques while maintaining fast inference. Extensive experiments on two large-scale multimodal datasets, NTU RGB+D and NTU RGB+D 120, validate the ability of HCMFN to strengthen the robustness of skeleton-based methods on two challenging HAR tasks: (1) discriminating between actions with subtle inter-class differences, and (2) recognizing actions from varying viewpoints. Compared with state-of-the-art multimodal methods, HCMFN achieves competitive results.
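To make the skeleton-guided transformation concrete, the sketch below illustrates one plausible reading of it: the 2D keypoints of a frame define a bounding box, and each modality frame (RGB, depth, or optical flow) is cropped and resized to that box to form a human-centric image. This is a minimal illustration, not the paper's implementation; the margin, output size, and function names are assumptions.

```python
# Hypothetical sketch of skeleton-guided human-centric cropping.
# The bounding-box-plus-margin strategy, margin value, and 224x224 output
# size are assumptions for illustration, not the paper's actual pipeline.
import numpy as np

def keypoint_bbox(keypoints: np.ndarray, margin: float = 0.15) -> tuple:
    """Return an (x0, y0, x1, y1) box around 2D keypoints, expanded by `margin`."""
    xs, ys = keypoints[:, 0], keypoints[:, 1]
    w, h = xs.max() - xs.min(), ys.max() - ys.min()
    return (xs.min() - margin * w, ys.min() - margin * h,
            xs.max() + margin * w, ys.max() + margin * h)

def crop_modality(frame: np.ndarray, box: tuple, out_size: int = 224) -> np.ndarray:
    """Crop a modality frame (RGB, depth, or optical flow) to the keypoint box
    and resize it to a square human-centric image via nearest-neighbour sampling."""
    h, w = frame.shape[:2]
    x0, y0 = max(0, int(box[0])), max(0, int(box[1]))
    x1, y1 = min(w, int(box[2])), min(h, int(box[3]))
    crop = frame[y0:y1, x0:x1]
    rows = np.linspace(0, crop.shape[0] - 1, out_size).astype(int)
    cols = np.linspace(0, crop.shape[1] - 1, out_size).astype(int)
    return crop[rows][:, cols]

# Example: one RGB frame with 25 joints (the NTU RGB+D skeleton has 25 joints).
rgb = np.random.rand(1080, 1920, 3).astype(np.float32)
joints_2d = np.random.rand(25, 2) * np.array([1920, 1080])
hci_rgb = crop_modality(rgb, keypoint_bbox(joints_2d))
print(hci_rgb.shape)  # (224, 224, 3)
```

The same crop would be applied per frame to each modality, so that every stream fed to HCMFN is centred on the person while retaining nearby object context.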
