Abstract

The main challenge of continuous sign language translation (CSLT) lies in extracting both discriminative spatial features and temporal features. In this paper, a spatial-temporal feature extraction network (STFE-Net) is proposed for CSLT, which optimally fuses the spatial and temporal features extracted by the spatial feature extraction network (SFE-Net) and the temporal feature extraction network (TFE-Net), respectively. SFE-Net performs pose estimation for the presenters in sign language videos. Starting from the 133 key points of COCO-WholeBody, the key point set is reduced to 53 points according to the characteristics of sign language. High-resolution pose estimation is performed on the hands, alongside the whole-body pose estimation, to obtain finer-grained hand features. The spatial features extracted by SFE-Net and the sign language words are then fed to TFE-Net, which is based on a Transformer with relative position encoding. In this paper, a dataset for Chinese continuous sign language was created and used for evaluation. STFE-Net achieves Bilingual Evaluation Understudy (BLEU-1, BLEU-2, BLEU-3, BLEU-4) scores of 77.59, 75.62, 74.25 and 72.14, respectively. Furthermore, the proposed STFE-Net was also evaluated on two public datasets, RWTH-Phoenix-Weather 2014T and CSL. The BLEU-1, BLEU-2, BLEU-3 and BLEU-4 scores achieved by our method on the former dataset are 48.22, 33.59, 26.41 and 22.45, respectively, and the corresponding scores on the latter dataset are 61.54, 58.76, 57.93 and 57.52. Experimental results show that our model achieves promising performance. Readers who need the code or dataset may email lunfee@whut.edu.cn.
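The abstract names relative position encoding as the key modification to the Transformer used in TFE-Net but does not detail it. The following is a minimal, illustrative sketch of single-head self-attention with learned relative position encodings in the style of Shaw et al. (2018), not the authors' TFE-Net implementation; the model dimension, clipping distance, and class/variable names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelPosSelfAttention(nn.Module):
    """Single-head self-attention with learned relative position
    encodings (Shaw et al., 2018 style). Illustrative stand-in for the
    attention in TFE-Net; d_model and max_rel_dist are assumptions."""

    def __init__(self, d_model: int = 256, max_rel_dist: int = 32):
        super().__init__()
        self.d_model = d_model
        self.max_rel_dist = max_rel_dist
        self.qkv = nn.Linear(d_model, 3 * d_model)
        # One learned embedding per clipped relative offset in [-K, K].
        self.rel_emb = nn.Embedding(2 * max_rel_dist + 1, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model), e.g. per-frame pose features.
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Content-content attention scores.
        scores = q @ k.transpose(-2, -1) / D ** 0.5          # (B, T, T)
        # Clipped relative offsets between every query/key position.
        idx = torch.arange(T, device=x.device)
        rel = (idx[None, :] - idx[:, None]).clamp(
            -self.max_rel_dist, self.max_rel_dist)
        r = self.rel_emb(rel + self.max_rel_dist)            # (T, T, D)
        # Content-position term q_i . r_ij, added to the scores so the
        # model attends by relative distance rather than absolute index.
        scores = scores + torch.einsum('btd,tsd->bts', q, r) / D ** 0.5
        attn = F.softmax(scores, dim=-1)
        return attn @ v                                      # (B, T, D)
```

In this sketch the input sequence would be the per-frame spatial features produced by SFE-Net; clipping the relative distance keeps the embedding table small while still letting attention depend on temporal offsets between frames.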
