Abstract
In this paper, a novel discriminative learning method is proposed to estimate generative models for multi-class pattern classification, where a discriminative objective function is formulated with separation margins according to a discriminative learning criterion such as large margin estimation (LME). Furthermore, an approximation-maximization (AM) method is proposed to optimize the discriminative objective function with respect to the parameters of the generative models. The AM approach provides a convenient framework for dealing with latent variables in generative models and is flexible enough to discriminatively learn many rather complicated models. Here, we focus on a family of generative models derived from multinomial distributions. Under some minor relaxation conditions, it is shown that AM-based discriminative learning for these models reduces to linear programming (LP) problems that can be solved effectively and efficiently, even for rather large-scale models. As a case study, we learn multinomial mixture models (MMMs) for text document classification under the large margin criterion. The proposed methods have been evaluated on the standard RCV1 text corpus. Experimental results show that large margin MMMs significantly outperform conventional MMMs as well as purely discriminative models such as support vector machines (SVMs), with over 25% relative classification error reduction observed on three independent RCV1 test sets.
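As an illustration of the separation margin referenced above, a standard LME formulation from the literature can be sketched as follows (the notation here is ours and is not necessarily the paper's exact formulation). For a training sample $X_i$ with true class label $y_i$ and generative class models $\lambda_1, \dots, \lambda_K$, define the discriminant function $F(X_i; \lambda_k) = \log p(X_i \mid \lambda_k)$ and the separation margin

$$ d(X_i) \;=\; F(X_i; \lambda_{y_i}) \;-\; \max_{k \neq y_i} F(X_i; \lambda_k), $$

so that $X_i$ is correctly classified if and only if $d(X_i) > 0$. Large margin estimation then seeks model parameters that maximize the minimum margin over a selected support set $S$ of training samples:

$$ \{\lambda_k\}^{*} \;=\; \arg\max_{\{\lambda_k\}} \; \min_{i \in S} \; d(X_i). $$

It is this max-min objective that, after the relaxations mentioned in the abstract, can be recast as a linear program for models built from multinomial distributions.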