Abstract

In this paper, we propose a general unsupervised feature selection method named unsupervised feature selection using principal component analysis (UFSPCA). Repeated information makes a dataset redundant, and when features are correlated it is hard to recognize that the information they carry is repeated. Accordingly, we first use PCA to create uncorrelated, orthogonal features, then calculate the similarities between the original features and the uncorrelated features. Next, we model the two feature sets and the similarities between them as a weighted bipartite graph. Finally, we obtain a maximum-weight matching in this graph using the Hungarian algorithm. The original-feature vertices that appear in this matching are the selected features. To illustrate the optimality and efficiency of the proposed method, we evaluate its performance on five datasets using the KNN classifier and compare it with seven well-known unsupervised feature selection algorithms. The evaluation results show that UFSPCA outperforms the other seven algorithms.
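
The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the similarity measure or the preprocessing, so the sketch assumes standardized features and uses the absolute Pearson correlation between each original feature and each principal component score as the edge weight. The function name `ufspca_select` and the parameter `k` (number of features to keep) are hypothetical. The matching step uses SciPy's `linear_sum_assignment`, which solves the same assignment problem as the Hungarian algorithm.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def ufspca_select(X, k):
    """Return indices of k original features picked by maximum-weight matching.

    Hypothetical helper: the name, the scaling and the similarity measure
    are assumptions made for illustration only.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if not 1 <= k <= min(n, d):
        raise ValueError("k must be between 1 and min(n_samples, n_features)")

    # Standardize each feature (assumed; the abstract does not state the scaling).
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant features stay at zero after centering
    Xs = (X - X.mean(axis=0)) / std

    # PCA via SVD: the rows of Vt are the orthogonal principal directions.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    Z = Xs @ Vt[:k].T  # uncorrelated component scores, shape (n, k)

    # Edge weights of the bipartite graph: |Pearson correlation| between
    # original feature i and component j (assumed similarity measure).
    z_std = Z.std(axis=0)
    z_std[z_std == 0] = 1.0
    S = np.abs(Xs.T @ Z) / (n * z_std)  # shape (d, k)

    # Maximum-weight matching: each component is paired with a distinct feature.
    rows, _ = linear_sum_assignment(S, maximize=True)
    return sorted(rows.tolist())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(200, 4))
    # Append a near-duplicate of feature 0 to create redundancy.
    X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=200)])
    print(ufspca_select(X, k=3))
```

Because the matching is one-to-one, no original feature can be assigned to more than one component, so the function always returns `k` distinct feature indices. Which of two highly correlated features is kept depends on the data and on the assumed similarity measure.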
