Abstract

Neural Sign Language Translation (SLT), an important cross-modal task that bridges the communication gap between deaf and hearing people, has attracted great attention in the fields of artificial intelligence, computer vision, and multimedia. Although great progress has been achieved recently, current neural SLT models still suffer from translation errors caused by the under-consideration of non-manual features such as facial expressions, which can carry critical information in communication among deaf people. This paper aims to enhance traditional neural SLT models by highlighting facial expression information in the CNN-based sign video representation stage. We propose two novel schemes. The first is based on a multi-stream architecture, which extracts and represents facial expression information in an additional stream and aggregates it with the information from the main stream. The second is a pre-training scheme based on Regions of Interest (RoIs), which first trains a multi-region detection module to recognize face and body features and then transfers the pre-trained parameters to the corresponding module in the SLT model. To validate the proposed models, we conducted experiments on the publicly available SLT benchmark dataset RWTH-PHOENIX-Weather-2014T. Experimental results show that both schemes improve the performance of SLT models. In particular, the RoI-based scheme achieves an improvement of over 1.6 BLEU-4 points, while the multi-stream scheme, through its flexible components, quantitatively analyzes the importance of the face and thereby provides a solid basis for the RoI-based scheme.
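As a rough illustration of the first scheme's aggregation step, the sketch below (assuming PyTorch; the module names, layer sizes, and the concatenation-based fusion are illustrative assumptions, not the authors' implementation) shows how an auxiliary face stream could be encoded alongside the main full-frame stream and fused into a single per-frame representation before the translation model.

```python
# Minimal sketch of a two-stream frame encoder: a main CNN over the full
# sign-video frame plus an auxiliary CNN over a cropped face region, with
# the two feature vectors aggregated by a learned linear fusion layer.
# All names and dimensions here are hypothetical.
import torch
import torch.nn as nn

class TwoStreamFrameEncoder(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        # Main stream: CNN over the full frame.
        self.main_cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Additional stream: CNN over the face crop (non-manual features).
        self.face_cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # Aggregation: concatenate both streams and project back down.
        self.fuse = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, frame, face_crop):
        h_main = self.main_cnn(frame)       # (B, feat_dim)
        h_face = self.face_cnn(face_crop)   # (B, feat_dim)
        return self.fuse(torch.cat([h_main, h_face], dim=-1))

# Usage: the fused per-frame features would feed the downstream
# sequence-to-sequence translation model.
enc = TwoStreamFrameEncoder()
frame = torch.randn(2, 3, 224, 224)    # batch of full frames
face = torch.randn(2, 3, 64, 64)       # matching face crops
feats = enc(frame, face)               # (2, 512)
```

Under the second scheme, the face/body detection backbone would be trained first and its parameters copied into the corresponding encoder module of the SLT model before end-to-end training.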
