A Supervised Feature Selection Method with Active Pairwise Constraints

Walid Atwa

doi:10.2139/ssrn.3389805

Abstract

Feature selection is an important preprocessing step in mining high-dimensional data. It aims to identify the most informative features for a compact and accurate data representation. Supervised feature selection methods, which use class labels as supervised information, typically perform better than unsupervised methods. Besides class labels, there are other forms of supervised information, such as pairwise constraints. However, most existing methods are passive in the sense that the pairwise constraints are provided beforehand and selected randomly. This may lead to the use of constraints that are redundant, unnecessary, or even harmful to the results. To address these problems, we propose a supervised feature selection method based on active pairwise constraints, which selects the most informative instances and queries their relationships with the neighborhoods. Experimental results on a series of high-dimensional datasets from the UCI repository demonstrate the efficacy of the proposed method compared with several established feature selection methods.

Full Text