Abstract

In visual speech recognition, there are two approaches to lip feature extraction: appearance-based and shape-based. The appearance-based approach usually performs better, because its visual features capture not only the lip structure but also the visibility of the teeth and tongue. Its disadvantage is that it produces too many features. The method studied here, an integration of double difference and horizontal-vertical image projection, belongs to the appearance-based approach and uses image projection for dimensionality reduction. In a previous study, the method succeeded in recognizing 5 everyday Indonesian words from videos recorded indoors. In this study, we used the same words together with 5 additional new words recorded outdoors. MLP (Multilayer Perceptron) and SVM (Support Vector Machine) are used as classifiers. The word recognition process is evaluated using 10-fold cross-validation. The tested method reached a classification accuracy of 88.92% and an AUC (Area Under the ROC Curve) of 0.9948.
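
To illustrate why image projection reduces dimensionality, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not define the double difference step, so a single absolute frame difference stands in for it as an assumption, and the function names (`projections`, `frame_difference`, `projection_features`) are hypothetical. The point is only that an H x W image collapses to H + W values when summed along rows and columns.

```python
def projections(image):
    """Return (horizontal, vertical) projections of a 2-D grayscale image.

    horizontal[i] is the sum of row i; vertical[j] is the sum of column j.
    """
    horizontal = [sum(row) for row in image]
    vertical = [sum(col) for col in zip(*image)]
    return horizontal, vertical


def frame_difference(a, b):
    """Pixel-wise absolute difference between two equally sized frames.

    Assumption: this is a simple stand-in for the paper's double
    difference, which the abstract does not specify.
    """
    return [[abs(x - y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def projection_features(prev_frame, next_frame):
    """Concatenate row and column projections of the difference image."""
    diff = frame_difference(prev_frame, next_frame)
    horizontal, vertical = projections(diff)
    return horizontal + vertical


# Two tiny 2 x 3 "frames": 6 pixels become 2 + 3 = 5 features.
prev_frame = [[0, 0, 0], [0, 0, 0]]
next_frame = [[1, 2, 3], [4, 5, 6]]
print(projection_features(prev_frame, next_frame))  # [6, 15, 5, 7, 9]
```

For a realistic lip region, say 64 x 64 pixels, the same idea would reduce 4096 pixel values to 128 projection values per frame pair, which is the kind of reduction the abstract refers to.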
