Re-Synchronization Using the Hand Preceding Model for Multi-Modal Fusion in Automatic Continuous Cued Speech Recognition

Li Liu, Xiao-Ping Zhang, Gang Feng, Denis Beautemps

doi:10.1109/tmm.2020.2976493

Abstract

Cued Speech (CS) is an augmented lip-reading system complemented by hand coding, and it is very helpful to deaf people. Automatic CS recognition can facilitate communication between deaf people and others. Because lip and hand movements are asynchronous, fusing them in automatic CS recognition is a challenging problem. In this work, we propose a novel re-synchronization procedure for multi-modal fusion that aligns the hand features with the lip features. It delays the hand position and hand shape features by their optimal hand preceding times, which are derived by investigating the temporal organization of hand position and hand shape movements in CS. This re-synchronization procedure is incorporated into a practical continuous CS recognition system that combines a convolutional neural network (CNN) with a multi-stream hidden Markov model (MSHMM). The system achieves 76.6% CS phoneme recognition correctness, an improvement of about 4.6 percentage points over the state-of-the-art architecture (72.04%), which did not take into account the asynchrony of multi-modal fusion in CS. To our knowledge, this is the first work to tackle asynchronous multi-modal fusion in automatic continuous CS recognition.
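
The re-synchronization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `delay_stream` and `fuse`, the frame-based delays, the edge padding at the start of the sequence, and the per-frame concatenation are all assumptions made for illustration. The abstract does not state the actual hand preceding times or how the aligned streams enter the MSHMM, so the delay values below are placeholders.

```python
from typing import List, Sequence

Frame = List[float]


def delay_stream(frames: Sequence[Frame], delay: int) -> List[Frame]:
    """Shift a feature stream later in time by `delay` frames.

    Output frame t is input frame t - delay. The first `delay` output
    frames repeat the first input frame (an assumed padding choice),
    so the output has the same length as the input.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return [list(frames[max(t - delay, 0)]) for t in range(len(frames))]


def fuse(
    lips: Sequence[Frame],
    hand_position: Sequence[Frame],
    hand_shape: Sequence[Frame],
    position_delay: int,
    shape_delay: int,
) -> List[Frame]:
    """Delay each hand stream by its own (hypothetical) preceding time,
    then concatenate the three streams frame by frame."""
    if not (len(lips) == len(hand_position) == len(hand_shape)):
        raise ValueError("all streams must have the same number of frames")
    delayed_position = delay_stream(hand_position, position_delay)
    delayed_shape = delay_stream(hand_shape, shape_delay)
    return [
        list(lip) + pos + shape
        for lip, pos, shape in zip(lips, delayed_position, delayed_shape)
    ]


# Toy one-dimensional features; the delays 2 and 1 are placeholders.
lips = [[0.0], [1.0], [2.0], [3.0]]
hand_position = [[10.0], [11.0], [12.0], [13.0]]
hand_shape = [[20.0], [21.0], [22.0], [23.0]]
fused = fuse(lips, hand_position, hand_shape, position_delay=2, shape_delay=1)
```

Using a separate delay per hand stream mirrors the abstract's statement that hand position and hand shape each have their own optimal hand preceding time. In a real system the delays would be derived from data rather than fixed by hand.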

Journal: IEEE Transactions on Multimedia
Publication Date: Mar 6, 2020
Citations: 56

Similar Papers

A novel resynchronization procedure for hand-lips fusion applied to continuous French Cued Speech recognition
Li Liu ... Gang Feng
01 Sep 2019

Cued Speech automatic recognition in normal-hearing and deaf subjects
Panikos Heracleous ... Noureddine Aboutabit
Speech Communication | VOL. 52
09 Mar 2010

An Attention Self-Supervised Contrastive Learning Based Three-Stage Model for Hand Shape Feature Representation in Cued Speech
Jianrong Wang ... Nan Gu
30 Aug 2021

Can explicit training in cued speech improve phoneme identification?
R Rees ... C Newton
Deafness & Education International | VOL. 19
02 Jan 2017
