Abstract

Recently there has been increased interest in using visual features for improved speech processing, in which lip reading plays a vital role. In this paper, a new approach to lip reading is presented. Visual speech recognition is applied in mobile phone applications, in human-computer interaction, and in recognizing the spoken words of hearing-impaired persons. The visual speech video is taken as input by a face detection module, which detects the face region. The mouth region is identified within the face region of interest (ROI), and the mouth images are passed to the feature extraction process. Features are extracted using four methods: every 10th coordinate, every 16th coordinate, 16 points + Discrete Cosine Transform (DCT), and Lip DCT. These features are then used as inputs for recognizing the visual speech with a Hidden Markov Model. Among the feature extraction methods, the DCT-based method gives the best recognition accuracy in the experiments. Ten participants uttered 35 different isolated words, and for each word 20 samples were collected for training and testing.
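As a rough illustration of the DCT feature step the abstract describes, the sketch below takes the 2-D DCT of a grayscale mouth-region image and keeps only the low-frequency top-left block of coefficients as a feature vector. The function name, image size, and block size are assumptions for illustration, not details from the paper.

```python
import numpy as np
from scipy.fft import dctn


def lip_dct_features(mouth_image, n_coeffs=4):
    """Hypothetical sketch of a Lip DCT feature extractor:
    compute the orthonormal 2-D DCT of a grayscale mouth image
    and keep the low-frequency top-left n_coeffs x n_coeffs block."""
    coeffs = dctn(mouth_image.astype(float), norm="ortho")
    # Low frequencies carry most of the lip-shape energy; the
    # flattened block becomes one observation vector for the HMM.
    return coeffs[:n_coeffs, :n_coeffs].flatten()


# Example with a synthetic 32x32 "mouth region"
rng = np.random.default_rng(0)
mouth = rng.random((32, 32))
feats = lip_dct_features(mouth)
print(feats.shape)  # 16-dimensional feature vector
```

A sequence of such per-frame vectors would then be fed to a Hidden Markov Model (e.g. one model per isolated word) for recognition.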
