Abstract

To remove redundant features and avoid the curse of dimensionality, the most relevant features should be selected for downstream tasks, including semi-supervised learning. Several semi-supervised constraint scores that use pairwise constraints have been proposed to estimate feature relevance. However, these methods evaluate features individually and ignore the correlations between them. We therefore propose a semi-supervised feature selection method, the iterative constraint score based on the hypothesis margin (HM-ICS), which uses forward sequential selection to find an optimal feature subset that preserves the constraint structure of the data and distinguishes samples from different classes. HM-ICS iteratively modifies the classical constraint score so that it accounts for correlations between features while maintaining the constraint structure of the data. By incorporating the hypothesis margin, HM-ICS ensures that the selected feature subset has strong discriminative power. Extensive experiments on nine UCI datasets and five high-dimensional datasets confirm that HM-ICS outperforms state-of-the-art supervised and semi-supervised methods.
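
To make the abstract's ingredients concrete, the sketch below shows the kind of selection loop it describes: greedy forward selection driven by a pairwise-constraint score plus a hypothesis-margin term. This is a minimal illustration under assumed formulas, not the authors' HM-ICS: the constraint-score ratio, the constraint-based nearhit/nearmiss margin, and the `alpha` trade-off are common stand-ins chosen here for brevity.

```python
import numpy as np

def constraint_score(X_sub, must_link, cannot_link, eps=1e-12):
    """Ratio of squared must-link to cannot-link distances on the selected
    features; smaller means the constraint structure is better preserved."""
    ml = sum(np.sum((X_sub[i] - X_sub[j]) ** 2) for i, j in must_link)
    cl = sum(np.sum((X_sub[i] - X_sub[j]) ** 2) for i, j in cannot_link)
    return ml / (cl + eps)

def hypothesis_margin(X_sub, must_link, cannot_link):
    """Mean hypothesis margin over constrained samples: distance to the
    nearest cannot-link partner (nearmiss) minus distance to the nearest
    must-link partner (nearhit); larger means stronger class separation."""
    hit, miss = {}, {}
    for pairs, table in ((must_link, hit), (cannot_link, miss)):
        for i, j in pairs:
            d = float(np.linalg.norm(X_sub[i] - X_sub[j]))
            table[i] = min(table.get(i, np.inf), d)
            table[j] = min(table.get(j, np.inf), d)
    both = hit.keys() & miss.keys()
    return float(np.mean([miss[s] - hit[s] for s in both])) if both else 0.0

def forward_select(X, must_link, cannot_link, k, alpha=1.0):
    """Greedily grow a subset of k features, at each step adding the feature
    that best lowers the constraint score and raises the hypothesis margin."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k and remaining:
        def criterion(f):
            sub = X[:, selected + [f]]
            return (constraint_score(sub, must_link, cannot_link)
                    - alpha * hypothesis_margin(sub, must_link, cannot_link))
        best = min(remaining, key=criterion)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage: feature 0 carries the class signal, so it is picked first.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
X[:20, 0] += 3.0
print(forward_select(X, must_link=[(0, 1), (20, 21)],
                     cannot_link=[(0, 20), (1, 21)], k=2))
```

Note that the subset is re-evaluated as a whole at every step, which is what lets a forward procedure account for feature correlations that per-feature constraint scores miss.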
