Incorporating large unlabeled data to enhance EM classification

Xintao Wu

doi:10.1007/s10844-006-0865-3

Abstract

This paper investigates the problem of augmenting labeled data with unlabeled data to improve classification accuracy. This is significant for many applications, such as image classification, where obtaining classification labels is expensive while large numbers of unlabeled examples are readily available. We investigate an Expectation Maximization (EM) algorithm for learning from labeled and unlabeled data. Unlabeled data boosts learning accuracy because it provides information about the joint probability distribution. A theoretical argument shows that the more unlabeled examples are combined in learning, the more accurate the result. We then introduce the B-EM algorithm, which combines EM with the bootstrap method, to exploit large unlabeled data sets while avoiding prohibitive I/O cost. Experimental results on both synthetic and real data sets show that the proposed approach performs satisfactorily.
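
The abstract does not give the model or the exact way the bootstrap is combined with EM, so the following is only a minimal illustrative sketch, not the paper's B-EM algorithm. It assumes one-dimensional Gaussian class-conditional densities, labeled examples whose class memberships stay fixed, and a bootstrap step that runs EM on small samples of the unlabeled data (drawn with replacement) and averages the resulting parameters. All function names and parameters (`semi_supervised_em`, `bootstrap_em`, `sample_size`, `n_rounds`) are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def m_step(xs, resp, n_classes):
    """Re-estimate priors, means and variances from soft class memberships."""
    priors, means, variances = [], [], []
    for k in range(n_classes):
        w = [r[k] for r in resp]
        wk = sum(w)  # assumes every class has at least one labeled example
        mean = sum(wi * x for wi, x in zip(w, xs)) / wk
        var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, xs)) / wk
        priors.append(wk / len(xs))
        means.append(mean)
        variances.append(max(var, 1e-6))  # floor avoids a degenerate variance
    return priors, means, variances


def e_step(xs, params, n_classes):
    """Posterior class probabilities for each unlabeled point."""
    priors, means, variances = params
    resp = []
    for x in xs:
        joint = [priors[k] * gaussian_pdf(x, means[k], variances[k])
                 for k in range(n_classes)]
        z = sum(joint)
        resp.append([j / z for j in joint] if z > 0
                    else [1.0 / n_classes] * n_classes)
    return resp


def semi_supervised_em(labeled, unlabeled, n_classes=2, n_iter=30):
    """EM over labeled (x, y) pairs and unlabeled x values."""
    lab_x = [x for x, _ in labeled]
    # Labeled points keep fixed one-hot memberships throughout.
    lab_resp = [[1.0 if y == k else 0.0 for k in range(n_classes)]
                for _, y in labeled]
    params = m_step(lab_x, lab_resp, n_classes)  # initialise from labels only
    for _ in range(n_iter):
        unl_resp = e_step(unlabeled, params, n_classes)
        params = m_step(lab_x + unlabeled, lab_resp + unl_resp, n_classes)
    return params


def bootstrap_em(labeled, unlabeled, sample_size, n_rounds=5, n_classes=2, seed=0):
    """Assumed bootstrap variant: EM on small resamples, parameters averaged."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_rounds):
        # Sample with replacement so each round touches only a small subset.
        sample = [rng.choice(unlabeled) for _ in range(sample_size)]
        runs.append(semi_supervised_em(labeled, sample, n_classes))
    averaged = []
    for part in zip(*runs):  # priors, then means, then variances
        averaged.append([sum(vals) / n_rounds for vals in zip(*part)])
    return tuple(averaged)


def predict(x, params):
    """Assign x to the class with the largest prior-weighted density."""
    priors, means, variances = params
    scores = [p * gaussian_pdf(x, m, v)
              for p, m, v in zip(priors, means, variances)]
    return scores.index(max(scores))
```

In this sketch, each bootstrap round reads only `sample_size` unlabeled points, which is one plausible way to limit how much of a large unlabeled set must be scanned per EM run; whether the paper's B-EM works this way is not stated in the abstract.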

Journal: Journal of Intelligent Information Systems
Publication Date: May 1, 2006
Citations: 22

Similar Papers

B-EM
Xintao Wu ... Kalpathi R Subramanian
23 Jul 2002

Semi-supervised Text Classification Using Partitioned EM
Gao Cong ... Bing Liu
01 Jan 2004

Positive and Unlabeled Examples Help Learning
Francesco De Comité ... Fabien Letouzey
01 Jan 1998

Semi-Supervised Learning
Tobias Scheffer
01 Jan 2009
