Abstract

Subspace distance is a valuable tool used in a wide range of feature selection methods. Its strength is that it can identify a representative subspace, that is, a group of features that efficiently approximates the space of the original features. Exploiting the intrinsic statistical information of the data can also play a significant role in feature selection, yet most existing methods founded on subspace distance make only limited use of it. To fill this gap, we propose a framework built on a subspace distance that we call the “Variance–Covariance subspace distance”. The approach exploits the correlation information contained in the features of the data and thereby determines the feature subsets whose corresponding Variance–Covariance matrix has the minimum-norm property. On this basis, we introduce a novel and efficient unsupervised feature selection framework that handles both dimensionality reduction and subspace learning. The proposed framework is able to exclude the features with the least variance from the original feature set. We also provide an efficient update algorithm, together with its convergence analysis, to solve the underlying optimization problem. Extensive experiments on nine benchmark datasets assess the performance of our method, and the results demonstrate its superiority over a variety of state-of-the-art unsupervised feature selection methods. The source code is available at https://github.com/SaeedKarami/VCSDFS.
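
To make two of the ideas in the abstract concrete, the following is a minimal sketch and not the authors' VCSDFS algorithm: it computes a sample Variance–Covariance matrix, keeps the features with the largest variance, and measures how well the kept features linearly approximate the full feature space. The function names, the least-squares reconstruction, and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

def variance_covariance(X):
    """Sample variance-covariance matrix of the columns (features) of X."""
    return np.cov(X, rowvar=False)

def drop_lowest_variance(X, k):
    """Return the sorted indices of the k features with the largest variance.

    Illustrative filter only; the paper's method solves an optimization
    problem instead of ranking variances directly.
    """
    variances = np.diag(variance_covariance(X))
    keep = np.argsort(variances)[::-1][:k]
    return np.sort(keep)

def reconstruction_error(X, selected):
    """Frobenius norm of the residual when every (centered) feature is
    approximated by a least-squares combination of the selected features."""
    Xc = X - X.mean(axis=0)
    Xs = Xc[:, selected]
    W, *_ = np.linalg.lstsq(Xs, Xc, rcond=None)
    return np.linalg.norm(Xc - Xs @ W)

# Synthetic data (hypothetical): five features with very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.1, 0.05])

selected = drop_lowest_variance(X, 3)
err_selected = reconstruction_error(X, selected)
err_low_var = reconstruction_error(X, np.array([3, 4]))
```

In this toy setting the high-variance subset leaves a much smaller residual than the two low-variance features do, which is the intuition behind preferring a representative subset; the actual selection criterion and update rule are those defined in the paper and its repository.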
