Abstract

Feature selection is a challenging and increasingly important task in the machine learning and data mining communities. Depending on the learning scenario, in particular the availability of supervision information, feature selection methods can be broadly categorized as supervised, unsupervised, or semi-supervised. Traditionally, feature selection algorithms have been scenario-specific: different types of algorithms were proposed and investigated separately for each learning scenario. In this paper, we develop a unified view of supervised, unsupervised, and semi-supervised feature selection. Based on the Hilbert-Schmidt independence criterion (HSIC), we first show that supervised, unsupervised, and semi-supervised feature selection methods share the same objective. By relating the unified framework to the HSIC least absolute shrinkage and selection operator (HSIC Lasso), we find that the framework not only has a clear statistical interpretation, namely that it selects minimally redundant features with maximum dependence on the target output variable as measured by the HSIC, but also admits a global optimal solution that can be computed efficiently by solving a Lasso optimization problem. Under this unified view, we also propose a new unsupervised feature selection algorithm and demonstrate it on several benchmark examples.
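
To make the connection to the Lasso concrete, below is a minimal sketch of HSIC Lasso-style feature selection as it is commonly formulated: each feature and the output are mapped to centered, Frobenius-normalized Gram matrices, and a non-negative Lasso regresses the output's Gram matrix onto the features' Gram matrices. This is not the authors' implementation; the kernel choice (Gaussian), the function names (centered_gram, hsic_lasso), and the parameters (gamma, lam) are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Lasso

def centered_gram(v, gamma=1.0):
    # Centered Gaussian Gram matrix H K H for a single feature (or output) vector,
    # normalized to unit Frobenius norm so HSIC terms are on a comparable scale.
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    K = np.exp(-gamma * (v - v.T) ** 2)
    n = len(v)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    return Kc / np.linalg.norm(Kc, "fro")

def hsic_lasso(X, y, lam=1e-5, gamma=1.0):
    # Solve min_{alpha >= 0} 1/2 ||vec(L) - sum_k alpha_k vec(K_k)||^2 + lam ||alpha||_1
    # via sklearn's non-negative Lasso. Note sklearn scales the fit term by
    # 1/(2 * n_samples) with n_samples = n^2 here, so lam is data-dependent
    # and typically very small (an assumption of this sketch, not a paper value).
    n, d = X.shape
    L = centered_gram(y, gamma).ravel()                    # vec(L): target
    Ks = np.column_stack([centered_gram(X[:, k], gamma).ravel()
                          for k in range(d)])              # columns vec(K_k)
    model = Lasso(alpha=lam, positive=True, fit_intercept=False)
    model.fit(Ks, L)
    return model.coef_                                     # sparse feature weights

# Toy usage: only feature 0 drives y, so its weight should dominate.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = np.sin(X[:, 0])
print(hsic_lasso(X, y).round(3))

The non-negativity constraint is what gives the weights their HSIC interpretation: a weight alpha_k > 0 indicates that feature k carries dependence on the output not already captured by other selected features, while redundant features (whose Gram matrices nearly coincide with an already-selected one) receive zero weight, matching the minimum-redundancy, maximum-dependence reading above.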

