Visual-semantic network: a visual and semantic enhanced model for gesture recognition

Yizhe Wang, Congqi Cao, Yanning Zhang

doi:10.1007/s44267-023-00027-6

Open Access

https://doi.org/10.1007/s44267-023-00027-6

Journal: Visual Intelligence
Publication Date: Oct 23, 2023
Citations: 1
License type: CC BY 4.0

Affiliation: Northwestern Polytechnical University

Abstract

Gesture recognition has attracted considerable attention and made encouraging progress in recent years owing to its great potential in applications. However, spatial and temporal modeling in gesture recognition remains an open problem. Specifically, existing works lack efficient temporal modeling and effective spatial attention. To model temporal information efficiently, we first propose a long- and short-term temporal shift module (LS-TSM) that models long-term and short-term temporal information simultaneously. We then propose a spatial attention module (SAM) that focuses on where the change primarily occurs, yielding effective spatial attention. In addition, the semantic relationship among gestures is helpful for gesture recognition but is usually neglected by previous works. We therefore propose a label relation module (LRM) that takes full advantage of the relationships among classes based on the semantic information of their labels. To explore the best form of LRM, we design four semantic reconstruction methods that incorporate the semantic relationship information into the semantic space of the class labels. We perform extensive ablation studies to analyze the best settings of each module. The best form of LRM is used to build our visual-semantic network (VS Network), which achieves state-of-the-art performance on two gesture datasets, EgoGesture and NVGesture.
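
The abstract does not say how LS-TSM is implemented, so the following is only a minimal sketch of the generic temporal shift idea that such modules build on: a fraction of the feature channels is moved along the time axis so that each frame sees features from its neighbours at no extra parameter cost. The function name `temporal_shift`, the parameters `fold` and `offset`, and the use of a larger `offset` as a stand-in for longer-range mixing are all assumptions made for illustration, not the authors' design.

```python
def temporal_shift(frames, fold, offset=1):
    """Shift part of the channels along time (hypothetical sketch).

    frames: list of T frames, each a list of C channel values.
    fold:   number of channels shifted in each direction.
    offset: how many time steps to shift (assumption: 1 for
            short-term mixing, a larger value for longer-term mixing).
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    # Start from zeros so positions with no source frame stay zero-padded.
    out = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t - offset      # take the value from an earlier frame
            elif c < 2 * fold:
                src = t + offset      # take the value from a later frame
            else:
                src = t               # remaining channels are left unshifted
            if 0 <= src < num_frames:
                out[t][c] = frames[src][c]
    return out


# Toy input: 3 frames, 4 channels, value = 10 * frame_index + channel_index.
frames = [[10 * t + c for c in range(4)] for t in range(3)]
short_term = temporal_shift(frames, fold=1, offset=1)
longer_term = temporal_shift(frames, fold=1, offset=2)
```

In this toy call, channel 0 of each frame is filled from the frame `offset` steps earlier and channel 1 from the frame `offset` steps later, while channels 2 and 3 are untouched. How the paper actually combines long-term and short-term shifts is not stated in the abstract.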

Similar Papers

Beyond Vision: A Semantic Reasoning Enhanced Model for Gesture Recognition with Improved Spatiotemporal Capacity
Yizhe Wang ... Congqi Cao
01 Jan 2021

Dynamic Gesture Recognition Based on 3D Separable Convolutional LSTM Networks
Xunlei Zhang ... Lin Qi
16 Oct 2020

Enhanced global attention upsample decoder based on enhanced spatial attention and feature aggregation module for semantic segmentation
Lianglu Yin ... Haifeng Hu
Electronics Letters | VOL. 56
01 Jun 2020

AF-SSD: An Accurate and Fast Single Shot Detector for High Spatial Remote Sensing Imagery.
Ruihong Yin ... Yongfeng Yin
Sensors | VOL. 20
15 Nov 2020
