Web page and image semi-supervised classification with heterogeneous information fusion

Youtian Du, Zhongmin Cai, Chang Su, Xiaohong Guan

doi:10.1177/0165551513477818

Abstract

Web data, such as web pages and web images, can be naturally partitioned into multiple heterogeneous attribute sets. Concretely, web pages consist of hyperlinks and content, and web images consist of textual and visual information. In this paper, we propose a new multi-view semi-supervised learning method, named local co-training, for web page and image classification. Local co-training employs local linear models to represent data points on each view (i.e. one attribute set), and iteratively refines them using unlabelled data with a co-training strategy. In each iteration, only a subset of the local models, which we call dominant local models, needs to be incrementally updated. The method is thus efficient and well suited to learning from large-scale web data. In addition, we introduce a new measurement, based on both confidence and disagreement, to describe which unlabelled examples are ‘good’ for enriching the training sets. Local co-training builds a bridge between two dominant types of semi-supervised methods: graph-based methods and co-training. Experiments on web page and web image datasets demonstrate that local co-training can effectively improve classification performance by exploiting multiple attribute sets and unlabelled data.
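
The two-view loop described in the abstract can be illustrated with a generic co-training sketch. This is not the paper's local co-training: a nearest-centroid classifier stands in for the local linear models, there is no notion of dominant local models, and the selection score (teacher confidence multiplied by the learner's disagreement) is a hypothetical stand-in for the confidence-and-disagreement measurement, whose exact form the abstract does not give. All function names and parameters below are assumptions made for illustration.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    # One centroid per class; a simple stand-in for the paper's local linear models.
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(centroids, X):
    # Softmax over negative squared distances to each class centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    z = -d
    z -= z.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def co_train(X1, X2, y_init, n_classes, rounds=5, per_round=2):
    """Generic two-view co-training.

    X1, X2 : feature matrices of the two views (same rows, different attributes).
    y_init : integer labels, with -1 marking unlabelled examples; every class
             must have at least one labelled example.
    Returns predicted labels for all rows, combining both views.
    """
    views = (X1, X2)
    labels = [y_init.copy(), y_init.copy()]  # each view keeps its own training labels

    def fit_and_score():
        models = [fit_centroids(X[l >= 0], l[l >= 0], n_classes)
                  for X, l in zip(views, labels)]
        return [predict_proba(m, X) for m, X in zip(models, views)]

    for _ in range(rounds):
        probs = fit_and_score()
        # Each view acts as teacher (a) for the other view (b).
        for a, b in ((0, 1), (1, 0)):
            cand = np.flatnonzero(labels[b] < 0)
            if cand.size == 0:
                continue
            pred_a = probs[a][cand].argmax(axis=1)
            conf_a = probs[a][cand].max(axis=1)
            # Hypothetical score: the teacher is confident, the learner is not yet
            # convinced of the same class. An assumption, not the paper's formula.
            agree_b = probs[b][cand, pred_a]
            score = conf_a * (1.0 - agree_b)
            order = np.argsort(-score)[:per_round]
            labels[b][cand[order]] = pred_a[order]

    # Final prediction: average the class probabilities of the two views.
    probs = fit_and_score()
    return ((probs[0] + probs[1]) / 2.0).argmax(axis=1)
```

For web pages, `X1` and `X2` might hold hyperlink and content features; for web images, textual and visual features. The score favours unlabelled examples that one view classifies confidently while the other view still disagrees, which is one plausible reading of why such examples are ‘good’ for enriching the other view's training set.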

Journal: Journal of Information Science
Publication Date: Feb 25, 2013
Citations: 1

Similar Papers

Web Image Semi-supervised Learning Method Based on Heterogeneous Information Fusion
You-Tian Du ... Qian Li
Acta Automatica Sinica | VOL. 38
01 Jan 2012

Multi-view semi-supervised web image classification via co-graph
Youtian Du ... Xiaohong Guan
Neurocomputing | VOL. 122
02 Jul 2013

Image classification for mobile web browsing
Takuya Maekawa ... Takahiro Hara
23 May 2006

Automatic Web Page Classification System with Improved Accuracy
Chait Hra ... Dr.G.M Lingaraju
Webology | VOL. 18
23 Dec 2021
