Abstract

With the rapid development of multimedia technology, large amounts of unlabelled, high-dimensional data need to be processed. High dimensionality not only increases the computational burden on hardware but also prevents algorithms from achieving optimal performance. Unsupervised feature selection, a form of dimensionality reduction, is widely recognized as an important and challenging preprocessing step for many machine learning and data mining tasks. However, we observe at least two issues in previous unsupervised feature selection methods. First, traditional algorithms usually assume that data instances are independent and identically distributed, whereas in practice instances are not only associated with high-dimensional features but also inherently interconnected with each other. Second, the similarity graph used in previous methods describes only pairwise relations between instances and cannot capture high-order relations, so the complex structures implied in the data are not sufficiently exploited. In this work, we propose a robust unsupervised feature selection method that embeds latent representation learning into feature selection. Instead of measuring feature importance in the original data space, feature selection is carried out in the learned latent representation space, which is more robust to noise. To capture the local manifold geometric structure of the original data in a high-order manner, a hypergraph is adaptively learned and embedded into the resulting model. An efficient alternating algorithm is developed to solve the resulting optimization problem. Experimental results on eight benchmark data sets demonstrate the effectiveness of the proposed method.
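To make the contrast between pairwise and high-order relations concrete, the following is a minimal sketch of a hypergraph Laplacian in which each hyperedge groups a sample with its k nearest neighbours. It is an illustration only, not the method proposed here: the abstract describes a hypergraph that is learned adaptively, whereas this sketch assumes a fixed k-nearest-neighbour construction, unit hyperedge weights, and the commonly used normalized form L = I - Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2). The function name `knn_hypergraph_laplacian` and the parameter `k` are hypothetical.

```python
import numpy as np


def knn_hypergraph_laplacian(X, k=3):
    """Build a fixed kNN hypergraph and its normalized Laplacian.

    Assumption (not from the paper): one hyperedge per sample, containing
    that sample and its k nearest neighbours, all with unit weight.

    X : array of shape (n_samples, n_features)
    Returns (H, L): incidence matrix (vertices x hyperedges) and Laplacian.
    """
    n = X.shape[0]

    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Force each sample to rank first in its own neighbour list.
    np.fill_diagonal(d2, -np.inf)

    # Incidence matrix: H[v, e] = 1 if vertex v belongs to hyperedge e.
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(d2[e])[: k + 1]  # sample e plus k neighbours
        H[members, e] = 1.0

    w = np.ones(n)              # unit hyperedge weights (assumption)
    edge_deg = H.sum(axis=0)    # number of vertices per hyperedge (k + 1)
    vert_deg = H @ w            # weighted number of hyperedges per vertex

    # Normalized hypergraph Laplacian.
    dv_inv_sqrt = 1.0 / np.sqrt(vert_deg)
    theta = (dv_inv_sqrt[:, None] * H) @ np.diag(w / edge_deg) @ (H.T * dv_inv_sqrt[None, :])
    L = np.eye(n) - theta
    return H, L


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 5))   # 10 synthetic samples, 5 features
    H, L = knn_hypergraph_laplacian(X, k=3)
    print(H.shape, L.shape)
```

Unlike an ordinary similarity graph, where every edge joins exactly two samples, each column of `H` here ties k + 1 samples together at once, which is what allows the Laplacian to encode relations beyond pairs.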
