Abstract

To recognize speech, handwriting or sign language, many hybrid approaches have been proposed that combine Dynamic Time Warping (DTW) or Hidden Markov Models (HMM) with discriminative classifiers. However, all methods rely directly on the likelihood models of DTW/HMM. We hypothesize that time warping and classification should be separated because of conflicting likelihood modelling demands. To overcome these restrictions, we propose to use Statistical DTW (SDTW) only for time warping, while classifying the warped features with a different method. Two novel statistical classifiers are proposed (CDFD and Q-DFFM), both using a selection of discriminative features (DF), and are shown to outperform HMM and SDTW. However, we have found that combining likelihoods of multiple models in a second classification stage degrades performance of the proposed classifiers, while improving performance with HMM and SDTW. A proof-of-concept experiment, combining DFFM mappings of multiple SDTW models with SDTW likelihoods, shows that also for model-combining, hybrid classification can provide significant improvement over SDTW. Although recognition is mainly based on 3D hand motion features, these results can be expected to generalize to recognition with more detailed measurements such as hand/body pose and facial expression.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.