Semi-supervised Dimension Reduction Using Graph-Based Discriminant Analysis

Gaksoo Lim, Cheong Hee Park

doi:10.1109/cit.2009.64

Abstract

Semi-supervised learning aims to utilize unlabeled data in the process of supervised learning. In particular, combining semi-supervised learning with dimension reduction can reduce the overfitting caused by small sample sizes in high-dimensional data. A graph whose edges carry similarity weights among data samples, both labeled and unlabeled, captures the statistical and geometric structure of the data, which is then used to explore the clustering structure of a small number of labeled samples. However, most semi-supervised dimension reduction methods use the information induced from unlabeled data points to modify only the within-class scatter of the labeled data, since unlabeled data cannot give any information about the distance between classes. In this paper, we propose a semi-supervised dimension reduction method that reinforces the between-class distance by using a penalty graph and minimizes the within-class scatter by using a similarity graph. We apply our approach to extend linear dimension reduction methods such as linear discriminant analysis (LDA) and maximum margin criterion (MMC), and demonstrate that also modifying the between-class distance can have a large impact on classification performance.
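The two-graph idea can be illustrated with a minimal sketch. The abstract does not specify how either graph is built or how the two terms are combined, so everything below is an assumption made for illustration, not the authors' formulation: a k-nearest-neighbor similarity graph with heat-kernel weights over all samples, a penalty graph that links labeled samples of different classes, an MMC-style difference criterion, and the convention that the label -1 marks an unlabeled sample. All function names are hypothetical.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W of a symmetric weight matrix."""
    return np.diag(W.sum(axis=1)) - W

def similarity_graph(X, k=5, sigma=1.0):
    """Assumed construction: kNN graph with heat-kernel weights over
    all samples (rows of X), labeled and unlabeled alike."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)          # a sample is not its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    return np.maximum(W, W.T)             # symmetrize

def penalty_graph(y):
    """Assumed construction: unit-weight edges between labeled samples
    of different classes; y == -1 marks an unlabeled sample."""
    labeled = y >= 0
    P = (y[:, None] != y[None, :]) & labeled[:, None] & labeled[None, :]
    return P.astype(float)

def fit_projection(X, y, dim, k=5, sigma=1.0):
    """MMC-style sketch: keep the top eigenvectors of
    X^T (L_penalty - L_similarity) X, so that penalty-graph pairs are
    pushed apart and similarity-graph pairs are kept close."""
    Ls = graph_laplacian(similarity_graph(X, k, sigma))
    Lp = graph_laplacian(penalty_graph(y))
    M = X.T @ (Lp - Ls) @ X               # symmetric d x d matrix
    vals, vecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    return vecs[:, np.argsort(vals)[::-1][:dim]]
```

Given a data matrix `X` with one sample per row and a label vector `y`, `fit_projection(X, y, dim=2)` returns a matrix with orthonormal columns, and `X @ G` gives the reduced representation. An LDA-style variant would replace the difference with a ratio of the two terms and solve a generalized eigenproblem instead.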
