Abstract

The paper overviews recent progress and challenges in a number of audiovisual speech processing technologies, with the main emphasis on the problem of automatic speech recognition. It is well known that visual channel information can improve automatic speech processing for human-computer interaction. To process such information and incorporate it into automatic systems, a number of steps are required that are surprisingly similar across speech technologies. Crucial above all is the issue of the feature representation of visual speech and its robust extraction. In addition, appropriate integration of the audio and visual representations is required, in order to ensure that the bimodal systems outperform audio-only baselines. These topics are discussed in detail in the talk, with particular emphasis on their application to the speech recognition problem in the challenging environments of automobiles and smart rooms.
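The abstract does not prescribe a specific integration scheme; as a minimal sketch of one common option, the code below illustrates feature-level (early) fusion by frame-synchronous concatenation of audio and visual feature vectors. The feature dimensions, frame rates, and interpolation step are illustrative assumptions, not the authors' method.

```python
import numpy as np

# Illustrative, assumed setup (not from the paper):
# audio features (e.g. MFCCs) at 100 frames/s, visual lip-region features at 30 frames/s.
audio_feats = np.random.randn(300, 13)   # 3 s of audio: 300 frames x 13 coefficients
visual_feats = np.random.randn(90, 20)   # 3 s of video: 90 frames x 20 visual features

# Upsample the visual stream to the audio frame rate by linear interpolation
# so the two streams are frame-synchronous before concatenation.
t_audio = np.linspace(0.0, 1.0, audio_feats.shape[0])
t_video = np.linspace(0.0, 1.0, visual_feats.shape[0])
visual_upsampled = np.stack(
    [np.interp(t_audio, t_video, visual_feats[:, d]) for d in range(visual_feats.shape[1])],
    axis=1,
)

# Early fusion: concatenate per-frame audio and visual vectors into one bimodal feature.
fused = np.concatenate([audio_feats, visual_upsampled], axis=1)
print(fused.shape)  # (300, 33) bimodal feature vectors, one per audio frame
```

The fused vectors could then be fed to a single recognizer; alternative schemes such as decision-level (late) fusion weight separate audio and visual model scores instead.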
