Abstract

Feature selection reduces data redundancy and improves algorithm performance in practical tasks. Most embedded feature selection models are built on the square loss or the hinge loss. However, models based on the square loss cannot directly evaluate the discriminability of samples in the feature subspace, while methods based on the hinge loss are hard to solve because of their complex objective functions. To address these problems, a Feature Selection method with Multi-class Logistic Regression (FSMLR) is proposed in this paper. First, we construct a linear function that measures the difference between the distance from each sample to its own regression hyperplane and the distances from that sample to the regression hyperplanes of the other classes, which strengthens the discriminant property of the embedded model. Then, we design a re-weighting matrix subject to an ℓ2,0-norm sparsity condition and a discreteness condition, which is used to select features in the subspace. Since a re-weighting matrix with discrete and sparse conditions is difficult to optimize directly, we relax both conditions and present a feature selection model via re-weighted multi-class logistic regression under the two relaxed constraints. Finally, we add F-norm regularization to the model to avoid overfitting, and derive its unconstrained equivalent formulation with ℓ2,p-norm regularization to explore the role of the re-weighting matrix. FSMLR can be solved by gradient descent; in particular, when the regularization term in the equivalent problem is set to the ℓ2,1-norm, the global optimal solution can be obtained. Extensive experiments on multiple public data sets show that FSMLR outperforms competing methods.
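The abstract states that FSMLR can be solved by gradient descent and that the ℓ2,1-norm case admits a global optimum. Since the full objective (the discriminant linear function and the re-weighting matrix) is not reproduced in the abstract, the following is only a minimal sketch of the ℓ2,1-regularized multi-class logistic regression core, using a proximal-gradient step to handle the non-smooth ℓ2,1 term. The names fsmlr_sketch and prox_l21, the step size, and the choice of a proximal step are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(Z):
    # Numerically stable row-wise softmax over class scores.
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def prox_l21(W, t):
    # Row-wise group soft-thresholding: the proximal operator of t * ||W||_{2,1}.
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return W * scale

def fsmlr_sketch(X, y, lam=0.1, lr=0.1, iters=500):
    # Sketch of l2,1-regularized multi-class logistic regression.
    # X: (n, d) data matrix; y: integer labels in {0, ..., c-1}.
    n, d = X.shape
    c = int(y.max()) + 1
    Y = np.eye(c)[y]          # one-hot labels, shape (n, c)
    W = np.zeros((d, c))      # one weight row per feature
    for _ in range(iters):
        P = softmax(X @ W)                   # class probabilities
        G = X.T @ (P - Y) / n                # gradient of the mean logistic loss
        W = prox_l21(W - lr * G, lr * lam)   # gradient step + proximal shrinkage
    return W

# Rank features by the l2-norm of the corresponding row of W,
# e.g. scores = np.linalg.norm(W, axis=1); near-zero rows are pruned.
```

Rows of W whose ℓ2-norm is driven to (near) zero correspond to discarded features, which matches the role the abstract assigns to the ℓ2,0/ℓ2,1 sparsity condition on the re-weighting matrix.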
