Abstract

Recently, there have been efforts to improve sign language recognition performance by designing self-supervised learning methods. However, these methods capture limited information from sign pose data in a frame-wise learning manner, leading to sub-optimal solutions. To address this, we propose a simple yet effective self-supervised contrastive learning framework that excavates rich context via spatial-temporal consistency from two distinct perspectives and learns instance-discriminative representations for sign language recognition. On one hand, since the semantics of sign language are expressed through the cooperation of fine-grained hands and coarse-grained trunks, we utilize information at both granularities and encode it into latent spaces. The consistency between hand and trunk features is constrained to encourage learning consistent representations of instance samples. On the other hand, inspired by the complementary property of the motion and joint modalities, we first introduce first-order motion information into sign language modeling. We further bridge the interaction between the embedding spaces of the two modalities, facilitating bidirectional knowledge transfer to enhance sign language representation. Our method is evaluated with extensive experiments on four public benchmarks and achieves new state-of-the-art performance by a notable margin. The source code is publicly available at https://github.com/sakura/Code.
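To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' released code: it assumes pose clips shaped (batch, frames, joints, coords), and the encoder outputs, function names, and temperature value are hypothetical placeholders. It shows (a) first-order motion as frame-to-frame differences of joint coordinates and (b) an InfoNCE-style consistency loss that pulls hand and trunk embeddings of the same instance together while pushing apart those of different instances.

```python
# Illustrative sketch only; shapes, names, and hyperparameters are assumptions,
# not the paper's implementation.
import torch
import torch.nn.functional as F

def first_order_motion(joints: torch.Tensor) -> torch.Tensor:
    """First-order motion: frame-to-frame differences of joint coordinates.

    Input:  (B, T, J, C) pose clips.
    Output: (B, T, J, C), zero-padded on the last frame to keep the length T.
    """
    motion = joints[:, 1:] - joints[:, :-1]        # (B, T-1, J, C)
    return F.pad(motion, (0, 0, 0, 0, 0, 1))       # pad the temporal dimension

def consistency_contrastive_loss(hand_emb: torch.Tensor,
                                 trunk_emb: torch.Tensor,
                                 temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style loss enforcing consistency between hand and trunk views.

    Matching rows of hand_emb and trunk_emb (same instance) are positives;
    all other pairs in the batch act as negatives.
    """
    hand = F.normalize(hand_emb, dim=-1)           # (B, D)
    trunk = F.normalize(trunk_emb, dim=-1)         # (B, D)
    logits = hand @ trunk.t() / temperature        # (B, B) similarity matrix
    targets = torch.arange(hand.size(0), device=hand.device)
    # Symmetric objective: hand-to-trunk and trunk-to-hand directions.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

if __name__ == "__main__":
    poses = torch.randn(4, 32, 27, 2)              # toy batch of 4 pose clips
    motion = first_order_motion(poses)
    # Random tensors stand in for the hand and trunk encoder outputs.
    hand_emb, trunk_emb = torch.randn(4, 128), torch.randn(4, 128)
    print(motion.shape, consistency_contrastive_loss(hand_emb, trunk_emb).item())
```

The same loss form could be applied between the joint and motion embedding spaces to realize the bidirectional knowledge transfer described above; how the paper weights or schedules these terms is not specified in the abstract.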
