Abstract

Graph-based unsupervised feature selection methods handle high-dimensional data well because they effectively preserve the structural information of the data. However, most existing methods face two challenges: (1) the intrinsic spatial structure extracted from raw data is unreliable because raw data contain redundant information and noise; (2) they consider only first-order similarity and ignore high-order similarity (i.e., samples with similar neighborhood network structures should be similar to each other). This study proposes unsupervised feature selection with high-order similarity learning (HSL) to tackle these problems. The projection matrix, the first-order similarity, and the high-order similarity are learned simultaneously in a unified framework, so that the intrinsic structure is extracted from clean data in a latent low-dimensional embedding subspace rather than from the raw data. Specifically, we project raw data into a low-dimensional embedding subspace to eliminate redundant information and noise. Moreover, we consider both the first-order similarity and the high-order similarity in this embedding subspace, so that a better graph Laplacian matrix can be learned to fully preserve the intrinsic structure information of raw data and simultaneously utilize the label information. Furthermore, we design an effective optimization algorithm to solve the proposed HSL model. Comprehensive experiments on real-world datasets verify the superiority and effectiveness of HSL.
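
To make the distinction between first-order and high-order similarity concrete, the following is a minimal sketch and not the HSL algorithm itself. The abstract does not give the objective function or the optimization procedure, so every step here is an assumption made for illustration: a PCA projection stands in for the learned projection matrix, a k-nearest-neighbor Gaussian kernel stands in for the learned first-order similarity, cosine similarity between neighborhood profiles stands in for the high-order similarity, and features are ranked with a Laplacian-score-style criterion. HSL learns these quantities jointly; the sketch computes them one after another. All function names and the parameters `dim`, `k`, and `alpha` are hypothetical.

```python
import numpy as np


def first_order_similarity(Z, k=5):
    """Symmetric kNN Gaussian-kernel similarity between the rows of Z."""
    n = Z.shape[0]
    sq = np.sum(Z ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero for stability.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]  # k nearest neighbors per sample
    sigma2 = np.median(d2[np.isfinite(d2)]) + 1e-12  # kernel bandwidth
    S = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    S[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return np.maximum(S, S.T)  # symmetrize


def high_order_similarity(S):
    """Cosine similarity between neighborhood profiles (the rows of S).

    Two samples score high when they have similar neighbors, even if
    they are not direct neighbors of each other.
    """
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    P = S / np.maximum(norms, 1e-12)
    H = P @ P.T
    np.fill_diagonal(H, 0.0)
    return H


def rank_features(X, dim=2, k=5, alpha=0.5):
    """Rank features by smoothness on a combined similarity graph.

    Returns (order, scores); a lower score means the feature varies
    less across strongly connected samples.
    """
    Xc = X - X.mean(axis=0)
    # Assumption: PCA replaces the projection matrix that HSL would learn.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:dim].T  # low-dimensional embedding
    S = first_order_similarity(Z, k)
    H = high_order_similarity(S)
    A = (1.0 - alpha) * S + alpha * H  # combined graph weights
    d = A.sum(axis=1)
    L = np.diag(d) - A  # unnormalized graph Laplacian
    # Laplacian-score-style criterion: f^T L f / f^T D f per feature,
    # after removing the degree-weighted mean of each feature.
    Xw = X - (d @ X) / d.sum()
    num = np.einsum("ij,ij->j", Xw, L @ Xw)
    den = np.einsum("ij,ij->j", Xw * d[:, None], Xw)
    scores = num / np.maximum(den, 1e-12)
    return np.argsort(scores), scores
```

Calling `rank_features(X)` on an `(n_samples, n_features)` array returns the feature indices ordered from smoothest to least smooth on the graph, together with the raw scores. The weight `alpha` controls how much the high-order term contributes; setting it to 0 recovers a purely first-order graph, which is the setting the abstract describes as insufficient.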
