Abstract

Deaf people find it difficult to “read speech” from the lips because many sounds, such as /p, b, m/ or /t, d, n/, look alike on the lips (“confusion groups”). Statistical computations on a large corpus of spoken French (95 000 phonemes), based on the frequencies of phonemes, diphones, and triphones, make it possible to assess the relative importance of the various confusion groups. We use information theory (conditional entropy, redundancy) to evaluate the impact of confusions on the information content of the spoken message. Identification rates of nonsense CV and VC syllables are also considered. We propose a classification of confusion groups according to the message degradation they cause and compare it with the current state of the art in automatic phoneme recognition. It appears that a research effort is needed to improve automatic recognition of “difficult” phonemes (mainly occlusives and nasals). These results could prove useful in our research to design an automatic device that would provide visual “cues” to help deaf people disambiguate the confusion groups.
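As an illustration of the conditional-entropy measure mentioned above, the following minimal sketch computes how much phoneme information remains unresolved once only the confusion group is visible. It is not the authors' procedure: the counts are invented toy values rather than corpus data, the names `PHONEME_COUNTS`, `CONFUSION_GROUPS`, `entropy`, and `conditional_entropy` are hypothetical, and only unigram (single-phoneme) frequencies are used, whereas the abstract also refers to diphones and triphones.

```python
import math
from collections import Counter

# Hypothetical toy counts for illustration only; NOT taken from the corpus.
PHONEME_COUNTS = {"p": 30, "b": 10, "m": 20, "t": 25, "d": 10, "n": 5}

# The two confusion groups named in the abstract; group labels are arbitrary.
CONFUSION_GROUPS = {
    "p": "pbm", "b": "pbm", "m": "pbm",
    "t": "tdn", "d": "tdn", "n": "tdn",
}


def entropy(counts):
    """Shannon entropy in bits of the distribution implied by raw counts."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


def conditional_entropy(phoneme_counts, groups):
    """H(phoneme | group) in bits.

    Each phoneme belongs to exactly one group, so the group is a
    deterministic function of the phoneme and
    H(phoneme | group) = H(phoneme) - H(group).
    """
    group_counts = Counter()
    for phoneme, count in phoneme_counts.items():
        group_counts[groups[phoneme]] += count
    return entropy(phoneme_counts) - entropy(group_counts)


if __name__ == "__main__":
    lost_bits = conditional_entropy(PHONEME_COUNTS, CONFUSION_GROUPS)
    print(f"Information left ambiguous by lip reading: {lost_bits:.3f} bits per phoneme")
```

Under these assumptions, a larger value of `conditional_entropy` means that a confusion grouping hides more of the message, which is one way the groups could be ranked against each other.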
