Abstract
In this paper, we present a novel Local Sensitive Dual Concept Learning (LSDCL) method for the task of unsupervised feature selection. We first reconstruct the original data matrix with the proposed dual concept learning model, which inherits the merit of the co-clustering-based dual learning mechanism for more interpretable and compact data reconstruction. We then adopt a local sensitive loss function, which places greater emphasis on the most similar pairs with small errors so as to better characterize the local structure of the data. In this way, our method can select features that yield better clustering results, through more compact data reconstruction and more faithful preservation of local structure. An iterative algorithm with a convergence guarantee is also developed to find the optimal solution. We fully investigate the performance improvement contributed by the newly developed terms, both individually and jointly. Extensive experiments on benchmark datasets further show that LSDCL outperforms many state-of-the-art unsupervised feature selection algorithms.
Highlights
We propose to reconstruct the data matrix via dual concept learning, where the feature-side and sample-side topics are represented by non-negative linear combinations
We introduce the Correntropy Induced Metric (CIM) [44], a generalized metric based on information-theoretic learning (ITL) [45]
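The CIM highlight can be illustrated with a small numerical sketch. The standard empirical form of CIM uses a Gaussian kernel g(e) = exp(-e²/2σ²) and measures distance as sqrt(g(0) − mean g(xᵢ − yᵢ)); the function name and σ default below are illustrative choices, not taken from the paper.

```python
import numpy as np

def cim(x, y, sigma=1.0):
    """Correntropy Induced Metric between vectors x and y.

    CIM(x, y) = sqrt(g(0) - mean_i g(x_i - y_i)), where
    g(e) = exp(-e^2 / (2 sigma^2)) is a Gaussian kernel.
    Small errors are penalized almost quadratically, while
    large errors saturate, making the metric robust to outliers.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    g = np.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))
    return np.sqrt(1.0 - g.mean())  # g(0) = 1 for the Gaussian kernel
```

Because the kernel saturates, a single huge outlier coordinate contributes roughly the same as a moderately large one, which is exactly the "emphasize small errors" behavior the method exploits.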
Summary
With the rapid development of data acquisition technology, huge amounts of high-dimensional data have become ubiquitous in a variety of real-world applications. As a data preprocessing strategy, feature selection methods have been proven effective and efficient at removing irrelevant and redundant features while keeping only a few relevant and informative ones, which reduces storage and computational cost while avoiding significant loss of information or degradation of subsequent learning performance. These feature selection algorithms can be broadly classified into supervised, semi-supervised and unsupervised methods according to the availability of supervision (H. Zhao et al.: Local Sensitive Dual Concept Factorization for Unsupervised Feature Selection [22]). The mismatch between the often-encountered small errors and the loss function can degrade the performance of unsupervised feature selection algorithms. We propose to reconstruct the data matrix via dual concept learning, where the feature-side and sample-side topics are represented by non-negative linear combinations. In order not to distract from the reading, proofs of the results are moved to the Appendix
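As background for the dual concept learning reconstruction, the summary's "non-negative linear combination" idea can be sketched with classical single-side concept factorization, X ≈ XWV^T with non-negative W and V, solved by multiplicative updates on the Gram matrix K = X^T X. This is only the classical building block; the paper's dual model additionally learns feature-side concepts, and the function name, initialization, and iteration count here are illustrative assumptions.

```python
import numpy as np

def concept_factorization(X, k, n_iter=200, eps=1e-9):
    """Classical concept factorization X ~= X W V^T (single side).

    Each of the k concepts is a non-negative linear combination of
    samples (columns of X W), and each sample is reconstructed as a
    non-negative combination of concepts. Assumes X is non-negative,
    so K = X^T X is non-negative and the updates stay valid.
    """
    rng = np.random.default_rng(0)
    n = X.shape[1]
    W = rng.random((n, k))
    V = rng.random((n, k))
    K = X.T @ X  # only inner products between samples are needed
    for _ in range(n_iter):
        # Standard multiplicative updates; eps guards against /0
        W *= (K @ V) / (K @ W @ V.T @ V + eps)
        V *= (K @ W) / (V @ W.T @ K @ W + eps)
    return W, V
```

The dual model in the paper applies the same reconstruction idea on both the sample side and the feature side, which is what yields the more compact, co-clustering-style factorization described above.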