Abstract
Current progress in the development of automatic speech recognition (ASR) systems may soon permit discrete symbolic speechreading supplements to be derived from the speech signal. Such supplements could be similar to those used in manual cued speech, in which the talker uses discrete hand positions and shapes to distinguish consonants and vowels that are often confused in speechreading. Highly trained receivers of manual cued speech can achieve nearly perfect reception of everyday connected speech materials at normal speaking rates through the visual sense alone. To understand the accuracy that might be achieved with automatically generated cues, we measured how well trained spectrogram readers and an automatic speech recognizer could assign cues for various cue systems. A model of audiovisual integration was then applied to these measurements and to published data on human recognition of consonant and vowel segments via speechreading. This analysis suggests that with cues derived from current recognizers, consonant and vowel segments can be received with accuracies in excess of 80%, roughly equivalent to the segment reception accuracy required to account for observed levels of manual cued speech reception. To provide guidance for the development of automatic cueing systems, we describe techniques for determining optimum cue groups for a given recognizer and speechreader, and estimate the cueing performance that might be achieved if the performance of current recognizers were improved.