Abstract

It is unclear whether Hidden Markov Model (HMM) or Dynamic Time Warping (DTW) mapping is more appropriate for visual speech recognition when only small data samples are available. In this letter, the two approaches are compared in terms of sensitivity to the amount of training samples and computing time with the objective of determining the tipping point. The limited training data problem is addressed by exploiting a straightforward template matching via weighted-DTW. The proposed framework is a refined DTW by adjusting the warping paths with judicially injected weights to ensure a smooth diagonal path for accurate alignment without added computational load. The proposed WDTW is evaluated on three databases (two in the public domain and one developed in-house) for visual recognition performance. Subsequent experiments indicate that the proposed WDTW significantly enhances the recognition rate compared to the DTW and HMM based algorithms, especially under limited data samples.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.