Maximum Entropy Models with Inequality Constraints: A Case Study on Text Categorization

Jun’ichi Kazama, Jun’ichi Tsujii

doi:10.1007/s10994-005-0911-3

Abstract

Data sparseness or overfitting is a serious problem in natural language processing that employs machine learning methods. This remains true even for the maximum entropy (ME) method, whose flexible modeling capability has alleviated data sparseness more successfully than other probabilistic models in many NLP tasks. With the ME method, we usually estimate the model so that it completely satisfies the equality constraints on feature expectations. However, complete satisfaction leads to undesirable overfitting, especially for sparse features, since constraints derived from a limited amount of training data are always uncertain. To control overfitting in ME estimation, we propose the use of box-type inequality constraints, where equality can be violated up to certain predefined levels that reflect this uncertainty. The derived models, inequality ME models, in effect perform regularized estimation with L1-norm penalties on bounded parameters. Most importantly, this regularized estimation enables the model parameters to become sparse. This can be thought of as automatic feature selection, which is expected to further improve generalization performance. We evaluate the inequality ME models on text categorization datasets and demonstrate their advantages over standard ME estimation, the similarly motivated Gaussian MAP estimation of ME models, and support vector machines (SVMs), which are among the state-of-the-art methods for text categorization.
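
The box-constraint idea in the abstract can be illustrated with a small sketch. It assumes the usual Lagrangian dual of a conditional ME model whose feature-expectation gaps are confined to a box of half-width `width`: each feature gets two non-negative multipliers, `alpha` and `beta`, the model weight is `alpha - beta`, and the objective is the log-likelihood minus `width * (alpha + beta)`, which acts as an L1 penalty. The function names, the toy data, the joint features x_j * [y == c], and the projected gradient ascent optimizer are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def model_probs(X, lam):
    # lam has shape (n_inputs, n_classes); the assumed joint features are
    # f_{j,c}(x, y) = x_j * [y == c], so class scores are simply X @ lam.
    scores = X @ lam
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(scores)
    return p / p.sum(axis=1, keepdims=True)

def fit_inequality_me(X, y, n_classes, width, steps=2000, lr=1.0):
    # Projected gradient ascent on the assumed dual objective:
    #   log-likelihood(alpha - beta) - width * sum(alpha + beta),
    # subject to alpha >= 0 and beta >= 0.
    n = X.shape[0]
    Y = np.eye(n_classes)[y]          # one-hot labels
    emp = X.T @ Y / n                 # empirical feature expectations
    alpha = np.zeros_like(emp)
    beta = np.zeros_like(emp)
    for _ in range(steps):
        # gap = empirical expectation minus model expectation, per feature
        gap = emp - X.T @ model_probs(X, alpha - beta) / n
        # Gradient step followed by projection onto the non-negative orthant.
        alpha = np.maximum(0.0, alpha + lr * (gap - width))
        beta = np.maximum(0.0, beta + lr * (-gap - width))
    return alpha, beta

# Toy data: inputs 0 and 1 identify the class, input 2 is uninformative.
X = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]], dtype=float)
y = np.array([0, 0, 1, 1])

alpha, beta = fit_inequality_me(X, y, n_classes=2, width=0.05)
lam = alpha - beta   # model weights; rows are inputs, columns are classes
print(lam)
```

In this toy run the weights attached to the uninformative third input stay exactly zero, because its expectation gap never leaves the box and the projection keeps both multipliers at zero. That is the sparsity effect the abstract describes as automatic feature selection. Setting `width=0` in this sketch recovers ordinary equality-constrained ME estimation.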

Journal: Machine Learning
Publication Date: Sep 1, 2005
Citations: 72

Similar Papers

A comparison of algorithms for maximum entropy parameter estimation
Robert Malouf
01 Jan 2002

Fast maximum entropy approximation in SPECT using the RBI-MAP algorithm
D.S Lalush ... B.M.W Tsui
08 Nov 1998

Fast maximum entropy approximation in SPECT using the RBI-MAP algorithm.
D.S Lalush ... E.C Frey
IEEE Transactions on Medical Imaging | VOL. 19
01 Apr 2000

The relationship between maximum entropy and maximum likelihood spectra
J V Pendrel ... D E Smylie
GEOPHYSICS | VOL. 44
01 Oct 1979
