Abstract

Feature selection, as a crucial pre-processing stage in expert and intelligent systems, aims to reduce the dimensionality of high-dimensional data by selecting an optimal subset from the original feature set. It can enhance interpretability, improve learning performance, and increase computational efficiency. In real-world applications, obtaining class labels for data is time-consuming and labor-intensive, so unsupervised feature selection is more practically important but correspondingly more challenging. Self-representation learning offers insights into unsupervised feature selection: its goal is to identify a representative feature subset such that all features can be well reconstructed from it. In this paper, we propose a new unsupervised feature selection method based on NOn-conVex Regularized Self-Representation (NOVRSR). Unlike most prior work, which resorts to pseudo labels of the data, NOVRSR exploits the importance and relevance of features through self-representation. Moreover, ℓ2,1-2 sparse regularization, which is non-convex yet Lipschitz continuous, is imposed on the representation coefficient matrix to perform feature selection. We show in theory that the ℓ2,1-2 regularizer guarantees the sparsity of the representation coefficient matrix. In addition, to solve the resulting non-convex problem, we design an iterative algorithm within the framework of the ConCave-Convex Procedure (CCCP) and prove that the iterative sequence converges to a stationary point satisfying the first-order optimality condition. An adapted Alternating Direction Method of Multipliers (ADMM) is embedded to solve the sequence of convex subproblems of CCCP efficiently. Extensive experimental studies on real-world datasets demonstrate the effectiveness of the proposed method.
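
As a rough sketch of the setup described above (the exact objective and notation are given in the full paper), let $X \in \mathbb{R}^{n \times d}$ denote the data matrix and $W \in \mathbb{R}^{d \times d}$ the representation coefficient matrix; a self-representation formulation consistent with the abstract is

\[
\min_{W} \; \|X - XW\|_F^2 \;+\; \lambda \left( \|W\|_{2,1} - \|W\|_F \right),
\]

where $\|W\|_{2,1} = \sum_{i=1}^{d} \|w^i\|_2$ sums the ℓ2-norms of the rows of $W$ and $\|W\|_F$ is its Frobenius (ℓ2) norm, so the ℓ2,1-2 penalty is a difference of two convex terms. CCCP exploits this difference-of-convex structure: at iteration $k$, the concave part $-\lambda\|W\|_F$ is replaced by its linearization at $W^k$, giving the convex subproblem

\[
W^{k+1} = \arg\min_{W} \; \|X - XW\|_F^2 \;+\; \lambda \|W\|_{2,1} \;-\; \lambda \, \langle G^k, W \rangle,
\qquad G^k \in \partial \|W^k\|_F,
\]

which is the piece solved by ADMM. Features can then be ranked by the row norms $\|w^i\|_2$ of the final $W$, and the top-ranked features form the selected subset. The symbols $\lambda$, $G^k$, and the specific squared-Frobenius reconstruction loss are illustrative assumptions, not taken verbatim from the paper.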
