Abstract

Hand gesture recognition is a challenging topic in the field of computer vision. Multimodal hand gesture recognition based on RGB-D achieves higher accuracy than recognition based on RGB or depth alone; the gain clearly originates from the complementary information in the two modalities. In practice, however, multimodal data are not always easy to acquire simultaneously, while unimodal RGB or depth hand gesture data are far more common. A hand gesture recognition system is therefore desirable in which only unimodal RGB or depth data are required at test time, while multimodal RGB-D data are available during training to exploit the complementary information. A family of methods based on multimodal training and unimodal testing has been proposed for this purpose, but unimodal feature representation and cross-modality transfer still need further improvement. To this end, this paper proposes a new 3D-Ghost and Spatial Attention Inflated 3D ConvNet (3DGSAI) to extract high-quality features for each modality. The baseline of the 3DGSAI network is the Inflated 3D ConvNet (I3D), on which two main improvements are made: a 3D-Ghost module and a spatial attention mechanism. The 3D-Ghost module extracts richer features for hand gesture representation, and the spatial attention mechanism makes the network pay more attention to the hand region. This paper also proposes an adaptive parameter for positive knowledge transfer, which ensures that transfer always occurs from the stronger modality network to the weaker one. Extensive experiments on the SKIG, VIVA, and NVGesture datasets demonstrate that our method is competitive with the state of the art. In particular, it reaches 97.87% on the SKIG dataset using only RGB data, which is the best result to date.
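To make the two architectural changes concrete, the sketch below shows one plausible PyTorch realization: a GhostNet-style Ghost module inflated to 3D convolutions, and a CBAM-style spatial attention layer applied to the space-time feature map. This is a minimal illustration under those assumptions, not the paper's exact implementation; the class names and default hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn

class Ghost3D(nn.Module):
    """GhostNet-style module inflated to 3D: a costly "primary" conv yields a
    few intrinsic feature maps, then cheap depthwise convs generate the rest."""
    def __init__(self, in_ch, out_ch, kernel_size=3, ratio=2, cheap_kernel=3):
        super().__init__()
        intrinsic = out_ch // ratio          # maps from the expensive conv
        ghost = out_ch - intrinsic           # maps from the cheap operation
        self.primary = nn.Sequential(
            nn.Conv3d(in_ch, intrinsic, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm3d(intrinsic), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(          # depthwise: one group per intrinsic map
            nn.Conv3d(intrinsic, ghost, cheap_kernel,
                      padding=cheap_kernel // 2, groups=intrinsic, bias=False),
            nn.BatchNorm3d(ghost), nn.ReLU(inplace=True))

    def forward(self, x):                    # x: (N, C, T, H, W)
        primary = self.primary(x)
        return torch.cat([primary, self.cheap(primary)], dim=1)

class SpatialAttention3D(nn.Module):
    """CBAM-style spatial attention inflated to 3D: channel-pooled statistics
    are turned into a per-location gate, emphasizing the hand region."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv3d(2, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # (N, 1, T, H, W)
        mx = x.amax(dim=1, keepdim=True)
        gate = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * gate                      # re-weight every space-time location
```

A `Ghost3D` block followed by `SpatialAttention3D` could, for example, replace a standard `Conv3d` stage inside each I3D stream.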

Highlights

  • Hand gestures are one of the most natural modes of interaction, and video-based hand gesture recognition aims to automatically attain the symbol describing the hand gesture action

  • Reference [21] adopted a temporal feature representation method, proposing a multikernel temporal block (MKTB) and a global refinement block (GRB) that model time series and, when combined, effectively explore the spatiotemporal feature representation of hand gestures. The feature representation of the proposed framework is based on the Inflated 3D ConvNet (I3D), on top of which we propose the 3DGSAI network, which aims to obtain more effective features and thereby realize high-performance single-modality gesture recognition

  • We describe the proposed methods in detail. The dynamic hand gesture recognition task of this paper is defined as follows: at test time, a dynamic hand gesture is recognized from unimodal RGB data $\{x_i^m, y_i\}$ or depth data $\{x_i^n, y_i\}$ alone, but to leverage multimodal knowledge and improve the unimodal recognition accuracy, multimodal RGB-D hand gesture video sequences $\{x_i^m, x_i^n, y_i\}$ are used for training, as shown in the sketch below. The method in this paper is not limited to the two modalities of RGB and depth and can be extended to more modalities
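The following is a minimal sketch of this train/test asymmetry, assuming two per-modality stream networks (`rgb_net` and `depth_net`, hypothetical stand-ins for the paper's 3DGSAI streams) that each return logits and an intermediate feature map:

```python
import torch
import torch.nn.functional as F

def train_step(rgb_net, depth_net, x_rgb, x_depth, y, coupling=1.0):
    """Multimodal training: both modalities of each sample are available, so
    each stream gets its own supervised loss plus a feature-coupling term."""
    logits_m, feat_m = rgb_net(x_rgb)        # RGB stream, inputs {x_i^m}
    logits_n, feat_n = depth_net(x_depth)    # depth stream, inputs {x_i^n}
    loss = F.cross_entropy(logits_m, y) + F.cross_entropy(logits_n, y)
    # simple symmetric coupling; a direction-aware variant is sketched
    # in the Introduction below
    return loss + coupling * F.mse_loss(feat_m, feat_n)

@torch.no_grad()
def test_step(rgb_net, x_rgb):
    """Unimodal testing: only one modality (here RGB) is observed."""
    logits, _ = rgb_net(x_rgb)
    return logits.argmax(dim=1)
```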


Introduction

Hand gestures are one of the most natural modes of interaction, and video-based hand gesture recognition aims to automatically attain the symbol describing the hand gesture action. One promising approach is to train the model on RGB-D multimodal data so that it receives more knowledge, while in the test stage the well-trained multimodal model recognizes hand gestures from only one modality. To this end, Abavisani et al. [5] proposed the MTUT model, a feasible realization of this idea, in which the I3D network is selected as the baseline representing each modality [6] and a semantic loss is minimized to implement cross-modality knowledge transfer.
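As a rough illustration of such a transfer term combined with the adaptive, direction-aware behavior described in the abstract, the sketch below aligns the two streams' features while detaching the stronger stream (the one with the lower classification loss), so that gradients only push the weaker stream toward the stronger one. The MSE coupling and the loss-based strength test are assumptions for illustration, not the exact formulation of [5] or of this paper.

```python
import torch
import torch.nn.functional as F

def alignment_loss(feat_m, feat_n, loss_m, loss_n):
    """Direction-aware feature alignment: knowledge flows from the stream
    with the lower classification loss (the stronger modality) to the other."""
    if loss_m.item() <= loss_n.item():
        # RGB stream is currently stronger: pull the depth features toward it
        return F.mse_loss(feat_n, feat_m.detach())
    # depth stream is currently stronger: pull the RGB features toward it
    return F.mse_loss(feat_m, feat_n.detach())
```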

