Abstract
Continuous sign language recognition (CSLR) aims to recognize the sequence of glosses in a sign language video. Enhancing the generalization ability of the visual feature extractor in CSLR is a worthy area of investigation. In this paper, we model glosses as priors that help to learn more generalizable visual features. Specifically, signer-invariant gloss features are extracted by a pre-trained gloss BERT model. We then design a gloss prior guidance network (GPGN). It contains a novel parallel densely-connected temporal feature extraction (PDC-TFE) module for multi-resolution visual feature extraction, which captures the complex temporal patterns of the glosses. The pre-trained gloss features guide visual feature learning through a cross-modality matching loss. We formulate the cross-modality feature matching as a regularized optimal transport problem, which can be efficiently solved by a variant of the Sinkhorn algorithm. The GPGN parameters are learned by optimizing a weighted sum of the cross-modality matching loss and the CTC loss. Experimental results on German and Chinese sign language benchmarks demonstrate that the proposed GPGN achieves competitive performance. An ablation study verifies the effectiveness of several critical components of the GPGN. Furthermore, the proposed pre-trained gloss BERT model and cross-modality matching can be seamlessly integrated into other RGB-cue-based CSLR methods as plug-and-play components to enhance the generalization ability of the visual feature extractor.
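To make the training objective described above concrete, the sketch below illustrates one way an entropy-regularized optimal-transport matching loss between frame-level visual features and pre-trained gloss features can be computed with Sinkhorn-style iterations and combined with the CTC loss as a weighted sum. This is a minimal illustration, not the paper's implementation; all names and hyperparameters (`epsilon`, `n_iters`, `lambda_match`, the feature shapes) are assumptions.

```python
# Minimal sketch (assumed implementation, not the authors' code) of an
# entropy-regularized optimal-transport matching loss between visual and
# gloss features, solved with Sinkhorn iterations, plus a weighted CTC term.
import torch
import torch.nn.functional as F

def sinkhorn(cost, epsilon=0.1, n_iters=50):
    """Entropy-regularized OT plan for a cost matrix of shape (T, L)."""
    T, L = cost.shape
    mu = torch.full((T,), 1.0 / T, device=cost.device)  # uniform marginal over video frames
    nu = torch.full((L,), 1.0 / L, device=cost.device)  # uniform marginal over glosses
    K = torch.exp(-cost / epsilon)                       # Gibbs kernel
    u = torch.ones_like(mu)
    for _ in range(n_iters):                             # alternating scaling updates
        u = mu / (K @ (nu / (K.t() @ u)))
    v = nu / (K.t() @ u)
    return u.unsqueeze(1) * K * v.unsqueeze(0)           # transport plan, shape (T, L)

def matching_loss(visual_feats, gloss_feats):
    """Cross-modality matching loss as the OT cost between the two feature sets."""
    v = F.normalize(visual_feats, dim=-1)                # (T, D) frame-level visual features
    g = F.normalize(gloss_feats, dim=-1)                 # (L, D) pre-trained gloss BERT features
    cost = 1.0 - v @ g.t()                               # cosine-distance cost matrix
    plan = sinkhorn(cost)
    return (plan * cost).sum()

def total_loss(logits, targets, input_lens, target_lens,
               visual_feats, gloss_feats, lambda_match=1.0):
    """Weighted sum of the CTC loss and the cross-modality matching loss."""
    # logits assumed to be (T, N, num_classes) as required by torch CTC loss
    ctc = F.ctc_loss(logits.log_softmax(-1), targets, input_lens, target_lens)
    return ctc + lambda_match * matching_loss(visual_feats, gloss_feats)
```

In this sketch, `lambda_match` plays the role of the weighting coefficient in the paper's combined objective; its value, like the regularization strength `epsilon`, would be tuned on the target benchmark.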