Abstract

Feature selection (FS) aims to eliminate redundant features and retain the informative ones. Since labeled data are not always readily available while abundant unlabeled data often are, semi-supervised FS (SSFS), which selects informative features using both labeled and unlabeled data, has become increasingly important. Semi-supervised sparse FS based on graph Laplacian (GL) regularization is a popular technique. However, GL regularization biases the solution towards a function that is constant along geodesics, extrapolates poorly, and cannot preserve the topological structure of the data well. Traditional ridge regression is widely used in GL-based semi-supervised sparse FS, but the resulting problem neither admits a closed-form solution nor exploits the manifold structure. To tackle these problems, we propose a Hessian-based SSFS framework under the generalized uncorrelated constraint (HSFSGU) with mixed ℓ2,p-norm (0<p≤1) regularization. We also propose a Hessian–Laplacian-based SSFS framework under the generalized uncorrelated constraint (HLSFSGU), which combines the GL and Hessian matrices. Our frameworks use the Hessian regularization to preserve the topological structure of the data, the ℓ2,p-norm regularization to make the projection matrix suitable for FS, and the generalized uncorrelated constraint to prevent excessive row sparsity of the projection matrix and to provide an elegant closed-form solution. We further present a unified algorithm that solves the frameworks in both the convex and non-convex cases, and prove its convergence. Experimental results demonstrate the ability of the Hessian-based frameworks under the generalized uncorrelated constraint to select informative features.
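Two of the ingredients named above can be illustrated concretely. The following minimal NumPy sketch shows (a) the unnormalized graph Laplacian L = D − S built from a symmetric similarity matrix S, the matrix underlying GL regularization, and (b) the mixed ℓ2,p (quasi-)norm of a projection matrix W, whose row-wise sparsity makes W suitable for FS. This is an illustrative sketch of the standard definitions only, not the paper's full HSFSGU/HLSFSGU objective; the function names are our own.

```python
import numpy as np

def graph_laplacian(S):
    """Unnormalized graph Laplacian L = D - S, where D is the
    diagonal degree matrix of the symmetric similarity matrix S."""
    D = np.diag(S.sum(axis=1))
    return D - S

def l2p_norm(W, p):
    """Mixed l2,p (quasi-)norm of W for 0 < p <= 1:
    (sum_i ||w_i||_2^p)^(1/p), where w_i is the i-th row of W.
    Small p drives more rows of W towards zero, i.e. row sparsity."""
    row_norms = np.linalg.norm(W, axis=1)  # l2 norm of each row
    return (row_norms ** p).sum() ** (1.0 / p)
```

For p = 1 this reduces to the familiar ℓ2,1 norm (the sum of row-wise ℓ2 norms); for p < 1 it is non-convex, which is why a unified solver covering both cases is needed. The rows of L sum to zero by construction, so L annihilates constant functions on the graph.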
