Abstract

Labeled training data are often too scarce to represent the distribution of the entire dataset, which is a major obstacle in many practical data mining applications. Semi-supervised learning algorithms, which learn from both labeled and unlabeled data, offer a way to address this problem, and graph-based semi-supervised learning has recently become one of the most active research areas. In this paper, we propose a novel graph-based semi-supervised learning approach, Class Dissimilarity based Linear Neighborhood Propagation (CD-LNP), which assumes that each data point can be linearly reconstructed from its neighborhood. The neighborhood graph of the input data is constructed according to a dissimilarity measure between data points that is specially designed to integrate class information. Using these linear neighborhoods, the algorithm propagates labels from the labeled points to the entire dataset with sufficient smoothness. Experimental results demonstrate that our approach outperforms other popular graph-based semi-supervised learning methods.
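
The propagation idea described above can be illustrated with a minimal sketch. It is not the CD-LNP method itself: the abstract does not define the class dissimilarity, so plain Euclidean distance is used here as a stand-in when selecting neighbors. The function name `lnp_propagate` and the parameters `n_neighbors`, `alpha`, and `reg` are hypothetical. The reconstruction weights are obtained from a regularized local least-squares solve, then clipped and normalized, and the labels are spread with the closed form of the iteration F ← αWF + (1 − α)Y, which is one common way to formulate linear neighborhood propagation and is assumed here.

```python
import numpy as np


def lnp_propagate(X, y, n_neighbors=3, alpha=0.9, reg=1e-3):
    """Sketch of linear-neighborhood label propagation.

    X: (n, d) array of points; y: length-n integer labels, -1 for unlabeled.
    Returns one predicted class label per point.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]
    classes = np.unique(y[y >= 0])

    # ASSUMPTION: Euclidean distance stands in for the paper's class
    # dissimilarity, which the abstract does not specify.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor

    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[:n_neighbors]
        Z = X[nbrs] - X[i]                  # neighbors centered on x_i
        G = Z @ Z.T                         # local Gram matrix
        G = G + reg * (np.trace(G) + 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        w = np.clip(w, 0.0, None)           # keep weights nonnegative
        W[i, nbrs] = w / w.sum()            # each row sums to one

    # One-hot matrix of the known labels; unlabeled rows stay zero.
    Y = np.zeros((n, len(classes)))
    for c_idx, c in enumerate(classes):
        Y[y == c, c_idx] = 1.0

    # Closed form of F <- alpha * W @ F + (1 - alpha) * Y at convergence.
    F = (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * W, Y)
    return classes[np.argmax(F, axis=1)]
```

Calling `lnp_propagate(X, y)` with `y` set to `-1` for every unlabeled point returns a predicted label for all points, labeled and unlabeled alike. Replacing the Euclidean distance with a class-aware dissimilarity would change only the neighbor selection step.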
