Abstract

Feature selection chooses a representative subset of features, which can greatly improve the computational efficiency and performance of data mining and machine learning tasks. Semi-supervised feature selection methods handle data whose labels are incomplete. Most least squares regression (LSR) based semi-supervised feature selection methods learn a projection matrix W that reveals the relationship between features and labels, and the features are ranked according to the norms of their corresponding rows of W. Sparse norm regularizations are often imposed on W to select discriminative features. However, without a constraint on the importance of features, similar features tend to receive similar rankings, so the features selected by such sparse-model-based methods may be redundant. To reduce this redundancy, we propose a regularization that penalizes highly correlated features. Combining it with an LSR-based loss function, a manifold regularization, and a regularization on the label matrix learned by LSR, we present a semi-supervised feature selection framework (SFSRM). Experimental results demonstrate the effectiveness of the proposed approach.
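The row-norm ranking step described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' SFSRM objective: it assumes a plain ridge-regularized least-squares fit for W on toy data, then ranks features by the l2 norm of their rows in W.

```python
import numpy as np

# Toy data: n samples, d features, c classes (one-hot label matrix Y).
# All names and sizes here are illustrative assumptions.
rng = np.random.default_rng(0)
n, d, c = 100, 8, 3
X = rng.standard_normal((n, d))
Y = np.eye(c)[rng.integers(0, c, size=n)]

# Ridge-regularized least-squares projection W (d x c):
#   W = (X^T X + lam * I)^{-1} X^T Y
lam = 1e-2
W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Rank features by the l2 norm of their corresponding rows in W:
# a larger row norm indicates a feature more strongly tied to the labels.
row_norms = np.linalg.norm(W, axis=1)
ranking = np.argsort(row_norms)[::-1]  # most important feature first
print(ranking[:3])                     # indices of the top-3 features
```

Note that this baseline has exactly the weakness the abstract points out: two highly correlated columns of X get similar rows in W and hence similar ranks, which is what the proposed correlation penalty is designed to counteract.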
