Abstract

Unlike representation learning models that use deep learning to project the original feature space into lower-density ones, we propose a feature space learning (FSL) model based on a semi-supervised clustering framework. Our approach makes three main contributions: (1) Inspired by Zipf's law and word bursts, the feature space learning process not only selects trusted unlabeled samples and useful features but also adaptively spans new feature spaces (e.g., only 13.4%–41.3% of the original feature space) and updates feature values; (2) building on two classical clustering methods (k-means and affinity propagation), four feature space learning (FSL) algorithms are proposed in a semi-supervised setting; (3) FSL provides better data understanding and a descriptive learned feature space that facilitates learning without the demanding training required by deep architectures. Experimental results on benchmark data sets demonstrate that FSL-based models promote learning performance better than classical unsupervised and semi-supervised clustering models such as k-means, affinity propagation, and semi-supervised affinity propagation, e.g., a 13.8% higher F-measure than SK-means and 73.6% higher than k-means. With a carefully designed risk-control strategy, the FSL model can dynamically disentangle explanatory factors, suppress noise accumulation and semantic shift, and construct easy-to-understand feature spaces.
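To make the abstract's idea concrete, the following is a minimal, hypothetical sketch of one FSL-style variant: k-means seeded with labeled samples, where each iteration (a) trusts only high-confidence unlabeled points (by distance margin) when updating centroids and (b) keeps only the most discriminative features, shrinking the feature space toward a small fraction of the original. The function name, the margin-based confidence measure, and all thresholds are illustrative assumptions, not the authors' actual algorithm.

```python
import numpy as np

def fsl_kmeans(X, y_labeled, labeled_idx, k, keep_frac=0.4, conf=0.6, iters=5):
    """Semi-supervised k-means with trusted-sample selection and
    feature-space shrinking (illustrative sketch, not the paper's method)."""
    # Seed centroids from the labeled samples of each class.
    centroids = np.array([X[labeled_idx[y_labeled == c]].mean(axis=0)
                          for c in range(k)])
    active = np.arange(X.shape[1])           # current (shrinking) feature space
    keep_n = max(1, int(X.shape[1] * keep_frac))
    assign = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        Xa, Ca = X[:, active], centroids[:, active]
        d = np.linalg.norm(Xa[:, None, :] - Ca[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # Confidence = relative margin between best and second-best centroid.
        sorted_d = np.sort(d, axis=1)
        margin = (sorted_d[:, 1] - sorted_d[:, 0]) / (sorted_d[:, 1] + 1e-12)
        trusted = margin > conf
        trusted[labeled_idx] = True          # labeled points are always trusted
        assign[labeled_idx] = y_labeled
        for c in range(k):                   # update centroids on trusted set only
            members = trusted & (assign == c)
            if members.any():
                centroids[c, active] = Xa[members].mean(axis=0)
        # Feature cut: keep features whose centroids spread the most apart.
        spread = centroids[:, active].std(axis=0)
        keep = np.argsort(spread)[::-1][:keep_n]
        active = active[np.sort(keep)]
    return assign, active
```

On toy data with two informative dimensions and several noise dimensions, the returned `active` index set typically collapses onto the informative features, mirroring the abstract's claim of learning a much smaller, easier-to-interpret feature space.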
