Abstract

In many real-world unsupervised learning applications, the data are balanced, that is, each class contains approximately the same number of instances, and the model we construct should reveal this balance. However, in many datasets, especially high-dimensional ones, the data in the original feature space do not exhibit such balance because of redundant and noisy features. To tackle this problem, we use unsupervised spectral feature selection to choose informative features that better reveal the balanced structure of the data. Although spectral feature selection is one of the most popular and widely studied families of unsupervised feature selection methods, none of the existing spectral feature selection methods considers the balance property of the data. To address this issue, we propose a novel balanced spectral feature selection (BSFS) method, which selects features that are not only discriminative but also reveal the balanced structure of the data. To the best of our knowledge, this is the first spectral feature selection method that considers the balanced structure of data. By introducing a balanced regularization term, we integrate balanced spectral clustering and feature selection seamlessly into a unified framework. Finally, experiments on benchmark datasets show that the proposed method outperforms conventional feature selection methods in both clustering performance and balance, which demonstrates its effectiveness and efficiency.
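
To make the notion of "balance" concrete, the following is a minimal sketch of one common way to score how evenly a clustering distributes instances across clusters: the normalized entropy of the cluster sizes. This is an illustrative assumption, not necessarily the balance measure or the regularization term used in BSFS; the function name `balance_score` and its parameters are hypothetical.

```python
import math
from collections import Counter


def balance_score(labels, n_clusters):
    """Normalized entropy of cluster sizes, in [0, 1].

    1.0 means every one of the n_clusters clusters holds the same
    number of instances; 0.0 means all instances fall in one cluster.
    NOTE: an illustrative measure, not the one defined in the paper.
    """
    if n_clusters < 2:
        raise ValueError("n_clusters must be at least 2")
    if not labels:
        raise ValueError("labels must be non-empty")

    n = len(labels)
    counts = Counter(labels)  # cluster id -> number of instances

    # Shannon entropy of the empirical cluster-size distribution.
    # Empty clusters contribute nothing (0 * log 0 is taken as 0).
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())

    # Divide by the maximum possible entropy, log(n_clusters).
    return entropy / math.log(n_clusters)


if __name__ == "__main__":
    balanced = [0, 0, 1, 1, 2, 2]
    skewed = [0, 0, 0, 0, 1, 2]
    print(balance_score(balanced, 3))
    print(balance_score(skewed, 3))
```

Under this measure, a clustering computed on a well-chosen feature subset would score closer to 1 than one computed on features dominated by noise, which is the kind of comparison the abstract describes.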
