Fusion of Skeleton and RGB Features for RGB-D Human Action Recognition

Xu Weiyao, Wu Muqing, Zhao Min, Xia Ting

doi:10.1109/jsen.2021.3089705

Abstract

Microsoft Kinect outputs a multimodal signal that provides RGB video, depth sequences, and skeleton information simultaneously, opening new opportunities for human action recognition research. However, exploiting and fusing useful features from these individual modalities remains a very challenging problem. Most RGB-D action recognition methods simply fuse the multimodal features, ignoring the potential semantic relationships between the different modalities. In this paper, we propose a multimodal action recognition model based on a Bilinear Pooling and Attention Network (BPAN), which can effectively fuse multimodal features for RGB-D action recognition. First, we apply efficient data preprocessing methods to the RGB and skeleton data. Then, we propose a multimodal fusion network that combines RGB video and skeleton sequences. The proposed BPAN module compresses the RGB and skeleton features and projects them into a latent subspace to obtain the fused features. Finally, a fully connected three-layer perceptron produces the final classification decision. Experimental results on three public datasets demonstrate that the proposed method performs favorably compared with state-of-the-art methods.
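
The abstract does not give the internals of BPAN, so the following is only a minimal sketch of the general idea it describes: project RGB and skeleton features into a shared latent subspace, combine them with a factorized bilinear pooling step, re-weight the result with an attention vector, and classify with a three-layer perceptron. Everything here is an assumption made for illustration: the function names (`fuse`, `classify`, `make_params`), the element-wise form of the bilinear pooling, the softmax attention, the layer sizes, and the random weights that stand in for learned parameters. Plain Python lists are used only to keep the sketch self-contained.

```python
import math
import random


def rand_matrix(rows, cols, rng):
    """Random weights standing in for learned parameters (illustrative only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def relu(values):
    return [max(0.0, v) for v in values]


def make_params(rgb_dim, skel_dim, latent_dim, hidden_dim, num_classes, seed=0):
    """Build hypothetical parameter matrices with a fixed seed."""
    rng = random.Random(seed)
    return {
        "U": rand_matrix(latent_dim, rgb_dim, rng),    # RGB projection
        "V": rand_matrix(latent_dim, skel_dim, rng),   # skeleton projection
        "A": rand_matrix(latent_dim, latent_dim, rng), # attention scores
        "W1": rand_matrix(hidden_dim, latent_dim, rng),
        "W2": rand_matrix(hidden_dim, hidden_dim, rng),
        "W3": rand_matrix(num_classes, hidden_dim, rng),
    }


def fuse(rgb_feat, skel_feat, params):
    """Assumed fusion step: low-rank bilinear pooling plus attention."""
    # Project each modality into the shared latent subspace.
    rgb_latent = matvec(params["U"], rgb_feat)
    skel_latent = matvec(params["V"], skel_feat)
    # Factorized bilinear pooling: element-wise product of the projections.
    joint = [r * s for r, s in zip(rgb_latent, skel_latent)]
    # Assumed attention: softmax scores re-weight each latent dimension.
    attention = softmax(matvec(params["A"], joint))
    return [a * z for a, z in zip(attention, joint)]


def classify(fused, params):
    """Three fully connected layers ending in class probabilities."""
    hidden1 = relu(matvec(params["W1"], fused))
    hidden2 = relu(matvec(params["W2"], hidden1))
    return softmax(matvec(params["W3"], hidden2))


# Toy usage with made-up feature vectors (not real RGB or skeleton features).
params = make_params(rgb_dim=8, skel_dim=6, latent_dim=4, hidden_dim=5, num_classes=3)
rgb = [0.1 * i for i in range(8)]
skel = [0.2 * i for i in range(6)]
fused = fuse(rgb, skel, params)
probs = classify(fused, params)
print(len(fused), len(probs))
```

The element-wise product of two projections is a common low-rank stand-in for a full bilinear (outer-product) interaction, which would otherwise need a weight tensor of size latent_dim × rgb_dim × skel_dim. Whether the paper uses this factorization or a different compression is not stated in the abstract.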

Journal: IEEE Sensors Journal
Publication Date: Sep 1, 2021
Citations: 27

Similar Papers

Multimodal Feature Fusion Model For RGB-D Action Recognition
Xu Weiyao ... Wu Muqing
05 Jul 2021

Skeleton Feature Fusion Based on Multi-Stream LSTM for Action Recognition
Lei Wang ... Yuncai Liu
IEEE Access | VOL. 6
01 Jan 2018

Multi-modal feature fusion for action recognition in RGB-D sequences
Amir Shahroudy ... Gang Wang
01 May 2014

Action Recognition Based on Optimal Joint Selection and Discriminative Depth Descriptor
Haomiao Ni ... Hong Liu
01 Jan 2017
