Unsupervised margin-based feature selection using linear transformations with neighbor preservation

Chien-Hsing Chen

doi:10.1016/j.neucom.2015.07.089

Abstract

We present a new margin-based feature selection method for unsupervised learning and use the selected salient features to discover interesting clusters with clustering algorithms. A basic characteristic of clustering motivates our method: each data instance usually belongs to the same cluster as its nearest neighbors and to a different cluster from its farthest neighbors. Intuitively, which instances are nearest and which are farthest is driven primarily by certain features, which we define as the salient features. Our method uses a maximum margin criterion to derive these salient features from the separability of the neighbors in the original representation space and in a linearly transformed space. Our method thus differs from existing wrapper-based methods for unsupervised learning, which usually must first partition the data into clusters and then use those clusters to select features. Experimental results on benchmark image datasets indicate that our method outperforms several filter- and wrapper-based baseline methods at selecting features for discovering interesting clusters.
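The neighbor intuition above can be illustrated with a small sketch. This is not the method proposed in the paper, which relies on a maximum margin criterion and linear transformations. It is a simplified, Relief-style heuristic written under our own assumptions: the function name `neighbor_margin_scores`, the parameter `k`, and the per-feature scoring rule are all hypothetical. Each feature is scored by how much more it differs between an instance and its farthest neighbors than between that instance and its nearest neighbors, with no clustering step involved.

```python
import numpy as np

def neighbor_margin_scores(X, k=3):
    """Score features by a simple neighbor margin (illustrative heuristic only).

    For every instance, take its k nearest and k farthest neighbors under
    Euclidean distance. A feature scores highly when it differs a lot
    toward the farthest neighbors and little toward the nearest ones.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of instances")

    # Pairwise per-feature differences, shape (n, n, d), and distances (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Exclude each instance from its own neighbor lists.
    near_d = dist.copy()
    np.fill_diagonal(near_d, np.inf)
    far_d = dist.copy()
    np.fill_diagonal(far_d, -np.inf)

    near_idx = np.argsort(near_d, axis=1)[:, :k]   # k smallest distances
    far_idx = np.argsort(-far_d, axis=1)[:, :k]    # k largest distances

    abs_diff = np.abs(diff)
    rows = np.arange(n)[:, None]
    # Mean per-feature gap to nearest / farthest neighbors, each of shape (d,).
    near_gap = abs_diff[rows, near_idx].mean(axis=(0, 1))
    far_gap = abs_diff[rows, far_idx].mean(axis=(0, 1))

    # Larger margin: the feature separates far neighbors from near ones.
    return far_gap - near_gap
```

Ranking features by these scores in descending order and keeping the top few gives a crude filter-style selection. Unlike the paper's approach, this heuristic scores each feature independently and does not learn any transformation of the representation space.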
