Abstract

Because labeled data are scarce, unsupervised feature selection has received considerable attention in recent years. Although many unsupervised feature selection methods can select relevant features, they often fail to account for both the local and the global structure of the data, and they cannot effectively handle the complex nonlinear relationships common in real-world data. As a result, they often select suboptimal feature subsets. In this paper, inspired by the Uniform Manifold Approximation and Projection (UMAP) manifold learning technique and by nonlinear sparse learning based on the Feature-Wise Kernelized Lasso, we propose a novel unsupervised feature selection method called Multi-Cluster Unsupervised Nonlinear Feature Selection based on UMAP and block HSIC Lasso (MUNFS). MUNFS improves the low-dimensional representation of high-dimensional data obtained during dimensionality reduction and effectively handles complex nonlinear relationships in such data. Specifically, by capturing the intrinsic topology of the data, MUNFS preserves the local structure while retaining as much of the global structure as possible. Furthermore, the kernel-based Hilbert–Schmidt Independence Criterion (HSIC) measures the nonlinear dependency between the features and the target variables, and an ℓ1 regularization term enforces sparsity in the selected features. This allows for a more precise assessment of the significance of each feature. Extensive experiments on five benchmark datasets and eight hyperspectral datasets demonstrate that MUNFS outperforms several other feature selection methods.
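To illustrate why a kernel-based criterion such as HSIC can capture nonlinear dependency, the following is a minimal sketch of the biased empirical HSIC estimator with Gaussian kernels. It is not the MUNFS implementation: the function names (`gaussian_gram`, `hsic`), the kernel width `sigma`, the normalization by n², and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix of a sample array (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    # Squared Euclidean distances, clipped at 0 to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC, trace(K H L H) / n^2.

    Normalization conventions differ across references; n^2 is assumed here.
    """
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K = gaussian_gram(x, sigma)
    L = gaussian_gram(y, sigma)
    return np.trace(K @ H @ L @ H) / n ** 2


def standardize(v):
    """Scale a 1-D sample to zero mean and unit variance."""
    return (v - v.mean()) / v.std()


# Synthetic example (assumed data): y depends on x only through x**2,
# so the linear correlation is weak, while z is unrelated noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
y = x ** 2 + 0.1 * rng.normal(size=200)
z = rng.normal(size=200)

hsic_dep = hsic(standardize(x), standardize(y))  # relevant feature vs. target
hsic_ind = hsic(standardize(z), standardize(y))  # irrelevant feature vs. target
print("HSIC(x, y):", hsic_dep)
print("HSIC(z, y):", hsic_ind)
```

In this sketch the feature with the quadratic relationship to the target receives a larger HSIC score than the unrelated feature. An HSIC Lasso style method builds on scores of this kind and adds an ℓ1 penalty so that only a sparse set of features keeps nonzero weight.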
