Abstract

Semi-supervised feature selection, which exploits both labeled and unlabeled data to select relevant features, plays an important role in many real-world applications. Most semi-supervised feature selection methods have been designed exclusively for classification problems. In this paper, a semi-supervised framework for regression problems is proposed, based on the graph Laplacian and a mixed convex and non-convex l2,p-norm (0 < p ≤ 1) regularization. In the proposed framework, a semi-supervised, graph Laplacian based scatter matrix constructed for regression problems encodes the label information of the labeled data and the local structure of both labeled and unlabeled data. To solve the mixed convex and non-convex l2,p-norm regularized framework, a unified iterative algorithm is proposed. The convergence of this algorithm is proved theoretically and verified experimentally. To evaluate the performance of the proposed framework on regression problems, we perform extensive experiments on several computational drug design regression datasets. The results demonstrate the superiority of the proposed framework over other feature selection methods in selecting relevant features.
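
To make the regularizer concrete, the following is a minimal sketch of the l2,p-norm as it is commonly defined in the feature selection literature, (sum_i ||w_i||_2^p)^(1/p), where w_i is the i-th row of a weight matrix W. The abstract does not state the exact definition used in the paper, so this form, the row-per-feature convention, and the names `l2p_norm` and `rank_features` are assumptions made for illustration. For p = 1 the penalty is convex (the l2,1-norm); for 0 < p < 1 it is non-convex.

```python
import numpy as np


def l2p_norm(W, p=0.5):
    """Assumed definition: (sum_i ||w_i||_2^p)^(1/p), with w_i the i-th row of W."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    row_norms = np.linalg.norm(W, axis=1)  # Euclidean norm of each row
    return float(np.sum(row_norms ** p) ** (1.0 / p))


def rank_features(W):
    """Order feature indices by decreasing row norm (hypothetical helper).

    Assumes each row of W holds the weights of one feature, so rows driven
    toward zero by the penalty correspond to features that can be dropped.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms)


# Toy weight matrix: 3 features (rows), 2 outputs (columns).
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

norm_convex = l2p_norm(W, p=1.0)      # convex case (l2,1-norm)
norm_nonconvex = l2p_norm(W, p=0.5)   # non-convex case
ranking = rank_features(W)
```

Under this convention, a row-sparse W yields a feature ranking directly: features whose rows have the largest norms are retained, and rows near zero are discarded.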
