Abstract

Recently, the application of transformers has driven significant progress in sign language translation. However, existing transformer-based methods neglect several characteristics of sign videos, which hinders translation performance. First, in sign videos, multiple consecutive frames represent a single sign gloss, so local temporal relations are crucial. Second, the inconsistency between video and text demands non-local, global context modeling from the model. To address these issues, a locality-aware transformer is proposed for sign language translation. Concretely, a multi-stride position encoding scheme assigns the same position index to adjacent frames at various strides to strengthen local dependencies. An adaptive temporal interaction module is then used to capture non-local and flexible local frame correlations simultaneously. Moreover, a gloss counting task is designed to facilitate holistic understanding of sign videos. Experimental results on two benchmark datasets demonstrate the effectiveness of the proposed framework.
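
To make the multi-stride position encoding idea concrete, below is a minimal PyTorch sketch, assuming learnable per-stride embedding tables and an illustrative stride set of (1, 2, 4); the class name, dimensions, and design details are hypothetical and not taken from the paper, which only states that adjacent frames at various strides share a position index.

```python
import torch
import torch.nn as nn

class MultiStridePositionEncoding(nn.Module):
    """Sketch: frames falling inside the same window of a given stride share
    one position index, so nearby frames receive identical positional signals.
    Encodings from all strides are summed (an assumed combination rule)."""

    def __init__(self, d_model: int, max_len: int = 512, strides=(1, 2, 4)):
        super().__init__()
        self.strides = strides
        # One learnable position table per stride (hypothetical design choice).
        self.tables = nn.ModuleList(
            nn.Embedding(max_len, d_model) for _ in strides
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_frames, d_model) frame features
        num_frames = x.size(1)
        positions = torch.arange(num_frames, device=x.device)
        encoding = torch.zeros_like(x)
        for stride, table in zip(self.strides, self.tables):
            # Adjacent frames inside one stride window share the same index,
            # e.g. stride 4 maps frames 0-3 -> 0, frames 4-7 -> 1, ...
            shared_idx = positions // stride
            encoding = encoding + table(shared_idx).unsqueeze(0)
        return x + encoding

# Usage: add multi-stride positional information to frame features.
frames = torch.randn(2, 64, 256)            # (batch, frames, d_model)
pe = MultiStridePositionEncoding(d_model=256)
out = pe(frames)                            # same shape as input
```

Summing tables over several strides gives each frame both a fine-grained index (stride 1) and coarser shared indices, which is one plausible way to bias attention toward local temporal neighborhoods.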
