Abstract

Techniques for performing audio-visual speech recognition, with improved recognition performance, in a degraded visual environment. For example, in one aspect of the invention, a technique for use in accordance with an audio-visual speech recognition system for improving a recognition performance thereof includes the steps/operations of: (i) selecting between an acoustic-only data model and an acoustic-visual data model based on a condition associated with a visual environment; and (ii) decoding at least a portion of an input spoken utterance using the selected data model. Advantageously, during periods of degraded visual conditions, the audio-visual speech recognition system is able to decode (recognize) input speech data using audio-only data, thus avoiding recognition inaccuracies that may result from performing speech recognition based on acoustic-visual data models and degraded visual data.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.