Abstract

Modern unsupervised feature selection methods predominantly obtain cluster structure and pseudo-label information through spectral clustering. However, the pseudo-labels obtained by spectral clustering usually contain a mix of positive and negative values. Moreover, the Laplacian matrix used in spectral clustering typically affects feature selection, and spectral clustering does not consider the interconnection information between data points. To address these problems, this paper proposes uncorrelated feature selection via sparse latent representation and extended orthogonal least square discriminant analysis (OLSDA), which we term SLREO. First, SLREO retains the interconnection between data points through latent representation learning, and thereby preserves the internal information of the data. To remove redundant interconnection information, an l2,1-norm constraint is applied to the residual matrix of latent representation learning. Second, SLREO obtains non-negative pseudo-labels through OLSDA with an embedded non-negative manifold structure. This not only avoids negative pseudo-labels, but also eliminates the effect of the Laplacian matrix on feature selection, while preserving the manifold information of the data. Furthermore, the matrix learned through latent representation and OLSDA is used as the pseudo-label information. This not only ensures that the generated pseudo-labels are non-negative, but also brings them closer to the true class labels. Finally, to avoid trivial solutions, an uncorrelated constraint and an l2,1-norm constraint are imposed on the feature transformation matrix. These constraints ensure row sparsity of the feature transformation matrix, select low-redundancy and discriminative features, and improve the effect of feature selection.
Experimental results on 11 benchmark datasets show that SLREO significantly improves Clustering Accuracy (ACC) and Normalized Mutual Information (NMI) compared with six other published algorithms.
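
To illustrate why an l2,1-norm constraint on the feature transformation matrix leads to feature selection, the following minimal sketch computes the l2,1-norm (the sum of the Euclidean norms of the rows) and ranks features by row norm. It is not the SLREO algorithm itself: the matrix `W`, its values, and the helper names `row_norms`, `l21_norm`, and `top_k_features` are assumptions made for illustration, and ranking by row norm is the usual convention in l2,1-based methods rather than a detail stated in this abstract.

```python
import math


def row_norms(W):
    """Euclidean (l2) norm of each row of W, given as a list of rows."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l2,1-norm of W: the sum of the row-wise l2 norms."""
    return sum(row_norms(W))


def top_k_features(W, k):
    """Indices of the k rows with the largest l2 norm, largest first."""
    norms = row_norms(W)
    order = sorted(range(len(W)), key=lambda i: norms[i], reverse=True)
    return order[:k]


# Hypothetical transformation matrix: 4 features (rows), 2 latent dimensions.
# Rows that are entirely zero correspond to features that would be discarded.
W = [
    [3.0, 4.0],
    [0.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]

print(l21_norm(W))
print(top_k_features(W, 2))
```

Because the penalty sums whole-row norms, minimizing it tends to drive entire rows to zero instead of scattered individual entries, which is what makes the surviving rows interpretable as selected features.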

