We address the sequence classification problem using a probabilistic model based on hidden Markov models (HMMs). In contrast to commonly used likelihood-based learning methods such as the joint or conditional maximum likelihood estimator, we introduce a discriminative learning algorithm that focuses on class margin maximization. Our approach has two main advantages: (i) as an extension of support vector machines (SVMs) to sequential, non-Euclidean data, it inherits the benefits of margin-based classifiers, such as provable generalization error bounds; (ii) unlike many algorithms based on non-parametric estimation of similarity measures that enforce weak constraints on the data domain, it uses the HMM's latent Markov structure to regularize the model in the high-dimensional sequence space. In an extensive set of evaluations on time-series sequence data of the kind that frequently appears in data mining and computer vision, the proposed method yields significant improvements in classification performance.