Discriminative sparse subspace learning and its application to unsupervised feature selection
- Research Article
4
- 10.1109/access.2018.2888924
- Jan 1, 2019
- IEEE Access
In this paper, we propose a novel feature selection model based on subspace learning with a large margin principle. First, we present a new margin metric defined for a given instance by its nearest missing and nearest hit, i.e., its nearest neighbors with a different label and with the same label, respectively. Specifically, the margin is the ratio of the distance to the nearest missing over the distance to the nearest hit, rather than the difference of the two distances; this yields better balance, since the distance to the nearest missing is usually much larger than that to the nearest hit. The proposed model seeks a subspace in which this margin metric is maximized. Moreover, since the nearest neighbors of a given sample are uncertain in the presence of many irrelevant features, we treat them as hidden variables and estimate the expectation of the margin. To perform feature selection, an $\ell_{2,1}$-norm is imposed on the subspace projection matrix to enforce row sparsity. The resulting trace ratio optimization problem, which can be connected to a nonlinear eigenvalue problem, is hard to solve, so we design an efficient iterative algorithm and present a theoretical analysis of its convergence. Finally, we evaluate the proposed method against several other state-of-the-art methods. Extensive experiments on real-world datasets show the superiority of our proposed approach.
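As a concrete illustration of the ratio-based margin described above, here is a minimal sketch in the original feature space, assuming Euclidean distance; the function name and setup are illustrative, not the paper's implementation (which maximizes the expected margin in a learned subspace).

```python
import numpy as np

def ratio_margin(X, y, i):
    """Ratio margin of instance i: distance to its nearest missing
    (closest sample with a different label) divided by the distance
    to its nearest hit (closest sample with the same label)."""
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf                       # exclude the instance itself
    same = (y == y[i])
    nearest_hit = d[same].min()         # closest same-label neighbor
    nearest_miss = d[~same].min()       # closest different-label neighbor
    return nearest_miss / nearest_hit   # ratio, not difference, of distances
```

A margin above 1 indicates the instance lies closer to its own class than to any other; the ratio form keeps large nearest-missing distances from dominating the way a difference would.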
- Research Article
18
- 10.1109/tcsvt.2020.2989659
- Apr 24, 2020
- IEEE Transactions on Circuits and Systems for Video Technology
Due to its effectiveness in learning subspace structures, low-rank representation (LRR) and its variants have been widely applied in various fields, such as computer vision and pattern recognition. However, in real applications, it is a challenge to handle complex noise. To address this problem, we propose a novel robust LRR method based on the kernel risk-sensitive loss (KRSL) with a high-order manifold constraint, called RHLRR, in which the KRSL is introduced to deal with the noise and a multiple-hypergraph regularization term is used as a high-order manifold constraint to effectively capture the locality, similarity, and intrinsic geometric information in the data. Besides, an iterative algorithm based on half-quadratic (HQ) optimization and the accelerated block coordinate update (BCU) is developed. The experimental results demonstrate that the proposed method can outperform other state-of-the-art LRR variants.
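The abstract does not spell out the kernel risk-sensitive loss; the sketch below follows the form commonly used in the robust-learning literature, an exponential of one minus a Gaussian kernel of the error, so the exact expression should be read as an assumption.

```python
import numpy as np

def krsl(residuals, sigma=1.0, lam=1.0):
    """Kernel risk-sensitive loss of a residual vector, assuming the
    usual form (1/lam) * mean(exp(lam * (1 - kappa_sigma(e)))), where
    kappa_sigma is a Gaussian kernel. Small errors cost almost nothing,
    while gross outliers saturate instead of growing quadratically."""
    kappa = np.exp(-residuals ** 2 / (2 * sigma ** 2))  # Gaussian kernel on errors
    return np.mean(np.exp(lam * (1.0 - kappa))) / lam
```

This boundedness is what lets the half-quadratic scheme mentioned above reweight samples and suppress complex noise.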
- Research Article
- 10.1145/3767726
- Sep 13, 2025
- ACM Computing Surveys
Dimensionality Reduction plays a pivotal role in improving feature learning accuracy and reducing training time by eliminating redundant features, noise, and irrelevant data. Nonnegative Matrix Factorization (NMF) has emerged as a popular and powerful method for dimensionality reduction. Despite its extensive use, there remains a need for a comprehensive analysis of NMF in the context of dimensionality reduction. To bridge this gap, this paper presents a comprehensive survey of NMF, focusing on its applications in both feature extraction and feature selection. We propose a novel classification scheme for dimensionality reduction to enhance understanding of its core principles. Subsequently, we delve into a thorough summary of diverse NMF approaches used for feature extraction and selection. Furthermore, we discuss the latest research trends and potential future directions for leveraging NMF in dimensionality reduction, aiming to highlight areas that need further exploration and development.
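For readers who want a concrete anchor for the factorization the survey analyzes, here is the classic Lee and Seung multiplicative-update NMF for the Frobenius objective; it is a generic baseline sketch, not any particular method from the survey.

```python
import numpy as np

def nmf(V, rank, iters=200, eps=1e-10):
    """Basic NMF: factor a nonnegative matrix V (m x n) into W (m x rank)
    and H (rank x n), both nonnegative. The multiplicative updates are
    guaranteed not to increase ||V - W @ H||_F^2."""
    rng = np.random.default_rng(0)
    m, n = V.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update W with H fixed
    return W, H
```

For dimensionality reduction, each column of H encodes the corresponding column of V in only `rank` dimensions, which is the use case the survey organizes.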
- Research Article
7
- 10.1109/access.2020.3010862
- Jan 1, 2020
- IEEE Access
In this paper, we present a novel Local Sensitive Dual Concept Learning (LSDCL) method for the task of unsupervised feature selection. We first reconstruct the original data matrix with the proposed dual concept learning model, which inherits the merit of the co-clustering-based dual learning mechanism for more interpretable and compact data reconstruction. We then adopt the local sensitive loss function, which places greater emphasis on the most similar pairs with small errors, to better characterize the local structure of the data. In this way, our method can select features that yield better clustering results, through more compact data reconstruction and more faithful local structure preservation. An iterative algorithm with a convergence guarantee is also developed to find the optimal solution. We fully investigate the performance improvement contributed by the newly developed terms, individually and simultaneously. Extensive experiments on benchmark datasets further show that LSDCL outperforms many state-of-the-art unsupervised feature selection algorithms.
- Research Article
278
- 10.1109/comst.2018.2883147
- Jan 1, 2019
- IEEE Communications Surveys & Tutorials
Traffic analysis is a compound of strategies intended to find relationships, patterns, anomalies, and misconfigurations, among other things, in Internet traffic. In particular, traffic classification is a subgroup of strategies in this field that aims at identifying the application's name or the type of Internet traffic. Nowadays, traffic classification has become a challenging task due to the rise of new technologies, such as traffic encryption and encapsulation, which decrease the performance of classical traffic classification strategies. Machine learning (ML) has gained interest as a new direction in this field, showing signs of future success such as knowledge extraction from encrypted traffic and more accurate Quality of Service management. ML is fast becoming a key tool for building traffic classification solutions in real network traffic scenarios; in this sense, the purpose of this investigation is to explore the elements that allow this technique to work in the traffic classification field. Therefore, a systematic review is introduced based on the steps to achieve traffic classification by using ML techniques. The main aim is to understand and identify the procedures followed by the existing works to achieve their goals. As a result, this survey finds a set of trends derived from the analysis performed on this domain; in this manner, the authors expect to outline future directions for ML-based traffic classification.
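To ground the pipeline the survey describes (feature extraction, then supervised learning), here is a minimal, purely illustrative sketch with scikit-learn; the flow features and synthetic labels are assumptions standing in for real captured traffic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical per-flow statistics (mean packet size, inter-arrival
# times, byte counts, ...); a real pipeline extracts these from captures.
rng = np.random.default_rng(0)
X = rng.random((1000, 6))              # 6 illustrative flow features
y = rng.integers(0, 3, size=1000)      # 3 illustrative traffic classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")  # ~chance on random data
```

Because models like this consume statistical flow features rather than payload bytes, they remain applicable to encrypted traffic, which is the survey's central motivation.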
- Research Article
4
- 10.1109/access.2020.3024690
- Jan 1, 2020
- IEEE Access
Feature selection and instance selection are dual operations on a data matrix. Feature selection aims at selecting a subset of relevant and informative features from the original feature space, while instance selection aims at identifying a subset of informative and representative instances. Most previous studies address these two problems separately, so that irrelevant features (resp. outliers) may mislead the process of instance (resp. feature) selection. In this paper, we address the problem by performing feature and instance selection simultaneously. We propose a novel unified framework, which chooses instances and features simultaneously, such that 1) all the data can be reconstructed from the selected instances and features and 2) the global structure, which is characterized by the sparse reconstruction coefficients, is preserved. Experimental results on several benchmark data sets demonstrate the effectiveness of our proposed method.
- Research Article
14
- 10.1109/tcsvt.2018.2856827
- Jul 1, 2019
- IEEE Transactions on Circuits and Systems for Video Technology
In order to efficiently utilize the information in the data and eliminate the negative effects of outliers in the principal component analysis (PCA) method, in this paper, we propose a novel robust sparse PCA method based on maximum correntropy criterion (MCC) with high-order manifold constraints called the RHSPCA. Compared with the traditional PCA methods, the proposed RHSPCA has the following benefits: 1) the MCC regression term is more robust to outliers than the MSE-based regression term; 2) thanks to the high-order manifold constraints, the low-dimensional representations can preserve the local relations of the data and greatly improve the clustering and classification performance for image processing tasks; and 3) in order to further counteract the adverse effects of outliers, the MCC-based samples’ mean is proposed to better centralize the data. We also propose a new solver based on the half-quadratic technique and accelerated block coordinate update strategy to solve the RHSPCA model. Extensive experimental results show that the proposed method can outperform the state-of-the-art robust PCA methods on a variety of image processing tasks, including reconstruction, clustering, and classification, on outliers contaminated datasets.
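The "MCC-based samples' mean" can be illustrated with the same half-quadratic idea the authors mention: alternate Gaussian-kernel weights and a weighted average. The sketch below is a minimal assumed form, not the RHSPCA solver itself.

```python
import numpy as np

def correntropy_mean(X, sigma=1.0, iters=50, tol=1e-8):
    """Robust mean under the maximum correntropy criterion via a
    half-quadratic fixed point: w_i = exp(-||x_i - m||^2 / (2 sigma^2)),
    then m = sum(w_i x_i) / sum(w_i). Outliers far from the bulk of the
    data receive near-zero weight and barely move the estimate."""
    m = X.mean(axis=0)                      # start from the ordinary mean
    for _ in range(iters):
        d2 = ((X - m) ** 2).sum(axis=1)
        w = np.exp(-d2 / (2 * sigma ** 2))  # half-quadratic auxiliary weights
        m_new = (w[:, None] * X).sum(axis=0) / (w.sum() + 1e-12)
        if np.linalg.norm(m_new - m) < tol:
            break
        m = m_new
    return m
```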
- Research Article
53
- 10.1016/j.patcog.2019.04.020
- Apr 25, 2019
- Pattern Recognition
Nonnegative Laplacian embedding guided subspace learning for unsupervised feature selection
- Research Article
25
- 10.1007/s13042-019-01046-w
- Dec 20, 2019
- International Journal of Machine Learning and Cybernetics
Most existing work on feature selection via matrix factorization techniques targets unsupervised learning problems. This paper introduces a new framework for supervised feature selection, called supervised feature selection by constituting a basis for the original space of features and matrix factorization (SFS-BMF). SFS-BMF is a guided search for a basis of the original feature space that inherently contains linearly independent features and can replace the original space. To find the best subset of features with respect to the class attribute, information gain is utilized in the process of constructing the basis: a basis for the original features is constructed from the most informative features in terms of information gain. This basis is then decomposed through a matrix factorization in order to select a subset of features. Our proposed method guarantees the maximum relevance of the selected features to the output by using information gain, while simultaneously securing minimum redundancy among them based on the linear independence property. Several experiments on high-dimensional microarray datasets are conducted to illustrate the efficiency of SFS-BMF. The experimental results show that the proposed SFS-BMF method outperforms some state-of-the-art feature selection methods with respect to both classification performance and computational complexity.
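A hedged sketch of the information-gain ranking step that seeds the basis construction; scikit-learn's mutual-information estimator and the dataset are stand-ins chosen here for illustration, not the paper's exact computation.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Score each feature by estimated mutual information with the class
# label (information gain), then rank; the top-ranked features would be
# the candidates from which a linearly independent basis is built.
scores = mutual_info_classif(X, y, random_state=0)
ranked = np.argsort(scores)[::-1]
print("top-5 feature indices:", ranked[:5])
```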
- Research Article
38
- 10.1109/tcsvt.2017.2783364
- Feb 1, 2019
- IEEE Transactions on Circuits and Systems for Video Technology
High-dimensional data contain not only redundancy but also noise produced by the sensors, and this noise is usually non-Gaussian. Metrics based on Euclidean distance are generally not suitable for such situations. In order to select the useful features and combat the adverse effects of the noise simultaneously, a robust sparse subspace learning method for the unsupervised scenario is proposed in this paper based on the maximum correntropy criterion, which shows strong robustness against outliers. Furthermore, an iterative strategy based on half-quadratic optimization and an accelerated block coordinate update is proposed. A convergence analysis of the proposed method is also carried out to ensure convergence to a reliable solution. Extensive experiments are conducted on real-world data sets to show that the new method can filter out outliers and outperform several state-of-the-art unsupervised feature selection methods.
- Research Article
6
- 10.1016/j.eswa.2024.123831
- Mar 26, 2024
- Expert Systems with Applications
Discriminative sparse subspace learning with manifold regularization
- Book Chapter
2
- 10.1007/978-3-319-60176-2_4
- Jan 1, 2017
Subspace learning is widely used in extracting discriminative features for classification. However, when data are contaminated with severe noise, the performance of most existing subspace learning methods would be limited. Recent advances in low-rank modeling provide effective solutions for removing noise or outliers contained in sample sets, which motivates us to take advantage of low-rank constraints in order to exploit a robust and discriminative subspace for classification. In this chapter, we introduce a discriminative subspace learning method named the Supervised Regularization based Robust Subspace (SRRS) approach, which incorporates a low-rank constraint. SRRS seeks low-rank representations from the noisy data, and jointly learns a discriminative subspace from the recovered clean data. A supervised regularization function is designed to make use of the class label information and therefore to enhance the discriminability of the subspace. Our approach is formulated as a constrained rank minimization problem, and we design an inexact augmented Lagrange multiplier (ALM) optimization algorithm to solve it. Unlike existing sparse representation and low-rank learning methods, our approach learns a low-dimensional subspace from recovered data and explicitly incorporates the supervised information. Our approach and some baselines are evaluated on the COIL-100, ALOI, Extended YaleB, FERET, AR, and KinFace databases. Experimental results demonstrate the effectiveness of our approach, especially when the data contain considerable noise or variations.
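Inside an inexact ALM solver like the one described, the low-rank subproblem is typically handled by singular value thresholding; the standard proximal step for the nuclear norm is sketched below as a generic building block, not the chapter's full algorithm.

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: solves
    argmin_X  tau * ||X||_* + 0.5 * ||X - A||_F^2
    by soft-thresholding the singular values of A, shrinking it
    toward a low-rank matrix."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # soft-threshold the spectrum
    return (U * s_shrunk) @ Vt
```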
- Research Article
2
- 10.1142/s0218001419510066
- Sep 1, 2019
- International Journal of Pattern Recognition and Artificial Intelligence
Subspace learning has been widely utilized to extract discriminative features for classification tasks such as face recognition, even when facial images are occluded or corrupted. However, the performance of most existing methods degrades significantly when the data are contaminated with severe noise, especially when the magnitude of the gross corruption can be arbitrarily large. To this end, in this paper, a novel discriminative subspace learning method is proposed based on the well-known low-rank representation (LRR). Specifically, a discriminant low-rank representation and the projecting subspace are learned simultaneously, in a supervised way. To avoid deviating from the original solution through relaxation, we adopt the Schatten [Formula: see text]-norm and [Formula: see text]-norm instead of the nuclear norm and [Formula: see text]-norm, respectively. Experimental results on two famous databases, i.e., PIE and ORL, demonstrate that the proposed method achieves better classification scores than the state-of-the-art approaches.
- Research Article
5
- 10.1016/j.jksuci.2023.101648
- Jul 10, 2023
- Journal of King Saud University - Computer and Information Sciences
Graph adaptive semi-supervised discriminative subspace learning for EEG emotion recognition
- Research Article
- 10.1093/comjnl/bxae049
- Jun 10, 2024
- The Computer Journal
Many subspace learning methods based on low-rank representation employ the nearest neighborhood graph to preserve the local structure. However, in these methods, the nearest neighborhood graph is a binary matrix, which fails to precisely capture the similarity between distinct samples. Additionally, these methods need to manually select an appropriate number of neighbors, and they cannot adaptively update the similarity graph during projection learning. To tackle these issues, we introduce Discriminative Subspace Learning with Adaptive Graph Regularization (DSL_AGR), an innovative unsupervised subspace learning method that integrates low-rank representation, adaptive graph learning and nonnegative representation into a framework. DSL_AGR introduces a low-rank constraint to capture the global structure of the data and extract more discriminative information. Furthermore, a novel graph regularization term in DSL_AGR is guided by nonnegative representations to enhance the capability of capturing the local structure. Since closed-form solutions for the proposed method are not easily obtained, we devise an iterative optimization algorithm for its resolution. We also analyze the computational complexity and convergence of DSL_AGR. Extensive experiments on real-world datasets demonstrate that the proposed method achieves competitive performance compared with other state-of-the-art methods.
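The contrast the abstract draws between a binary neighborhood graph and a similarity-graded one can be shown in a few lines; the Gaussian (heat-kernel) weighting below is a generic choice for illustration, not the adaptively learned graph of DSL_AGR.

```python
import numpy as np

def knn_graphs(X, k=5, sigma=1.0):
    """Build two k-NN adjacency matrices: a 0/1 graph (the kind the
    abstract criticizes) and a heat-kernel weighted graph that grades
    how similar each pair of neighbors actually is."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)            # no self-loops
    binary = np.zeros((n, n))
    weighted = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]        # indices of the k nearest neighbors
        binary[i, nbrs] = 1.0               # 0/1 edges discard similarity levels
        weighted[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    return binary, weighted
```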
- Research Article
7
- 10.1109/embc.2014.6944447
- Aug 1, 2014
- Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. In this paper, we present two unsupervised spike sorting algorithms based on discriminative subspace learning. The first algorithm simultaneously learns the discriminative feature subspace and performs clustering. It uses histogram of features in the most discriminative projection to detect the number of neurons. The second algorithm performs hierarchical divisive clustering that learns a discriminative 1-dimensional subspace for clustering in each level of the hierarchy until achieving almost unimodal distribution in the subspace. The algorithms are tested on synthetic and in-vivo data, and are compared against two widely used spike sorting methods. The comparative results demonstrate that our spike sorting methods can achieve substantially higher accuracy in lower dimensional feature space, and they are highly robust to noise. Moreover, they provide significantly better cluster separability in the learned subspace than in the subspace obtained by principal component analysis or wavelet transform.
- Research Article
126
- 10.1109/tnnls.2015.2464090
- Aug 31, 2015
- IEEE Transactions on Neural Networks and Learning Systems
In this paper, we aim at learning robust and discriminative subspaces from noisy data. Subspace learning is widely used in extracting discriminative features for classification. However, when data are contaminated with severe noise, the performance of most existing subspace learning methods would be limited. Recent advances in low-rank modeling provide effective solutions for removing noise or outliers contained in sample sets, which motivates us to take advantage of low-rank constraints in order to exploit robust and discriminative subspace for classification. In particular, we present a discriminative subspace learning method called the supervised regularization-based robust subspace (SRRS) approach, by incorporating the low-rank constraint. SRRS seeks low-rank representations from the noisy data, and learns a discriminative subspace from the recovered clean data jointly. A supervised regularization function is designed to make use of the class label information, and therefore to enhance the discriminability of subspace. Our approach is formulated as a constrained rank-minimization problem. We design an inexact augmented Lagrange multiplier optimization algorithm to solve it. Unlike the existing sparse representation and low-rank learning methods, our approach learns a low-dimensional subspace from recovered data, and explicitly incorporates the supervised information. Our approach and some baselines are evaluated on the COIL-100, ALOI, Extended YaleB, FERET, AR, and KinFace databases. The experimental results demonstrate the effectiveness of our approach, especially when the data contain considerable noise or variations.
- Research Article
- 10.1155/2020/8872348
- Nov 4, 2020
- Complexity
Recently, cross-view feature learning has been a hot topic in machine learning due to the wide applications of multiview data. Nevertheless, the distribution discrepancy between views means that instances of different views from the same class can be farther apart than instances within the same view but from different classes. To address this problem, in this paper, we develop a novel cross-view discriminative feature subspace learning method inspired by layered visual perception in humans. Firstly, the proposed method utilizes a separable low-rank self-representation model to disentangle the class and view structure layers, respectively. Secondly, a local alignment is constructed with two designed graphs to guide the subspace decomposition in a pairwise way. Finally, a global discriminative constraint on the distribution center of each view is designed for further alignment improvement. Extensive cross-view classification experiments on several public datasets prove that our proposed method is more effective than other existing feature learning methods.
- Research Article
13
- 10.1109/tcsvt.2019.2918591
- May 29, 2019
- IEEE Transactions on Circuits and Systems for Video Technology
Although deep features have recently achieved state-of-the-art performance in action recognition, hand-crafted shallow features still play a critical role in characterizing human actions by capturing visual content in an intuitive way, such as edge features. The shallow features can therefore serve as auxiliary visual cues supplementary to deep representations. In this paper, we propose a discriminative subspace learning model (DSLM) to explore the complementary properties between hand-crafted shallow feature representations and deep features. For RGB action recognition, this is the first work attempting to mine multi-level feature complementarities through a multi-view subspace learning scheme. To sufficiently capture the complementary information among heterogeneous features, we construct the DSLM by integrating the multi-view reconstruction error and classification error into a unified objective function. To be specific, we first use Fisher Vectors to encode improved dense trajectories (iDT+FV) for shallow representations and two-stream convolutional neural network models (T-CNN) to generate deep features. The DSLM algorithm then projects the multi-level features onto a shared discriminative subspace that incorporates both the complementary information and discriminating capacity. Finally, the action types of test samples are identified by the margins from the learned compact representations to the decision boundary. Experimental results on three datasets demonstrate the effectiveness of the proposed method.
- Research Article
1
- 10.1007/s00500-022-07333-z
- Jul 14, 2022
- Soft Computing
Human age estimation from facial images has become an active research topic in the computer vision field because of various real-world applications. The temporal property of facial aging displays sequential patterns that lie on a low-dimensional aging manifold. In this paper, we propose a hidden factor analysis (HFA) model-based discriminative manifold learning method for age estimation. Hidden factor analysis decomposes facial features into an independent age factor and an identity factor. Various age-invariant face recognition systems in the literature utilize the identity factor for face recognition; however, the age factor remains unutilized. The age component of the HFA model depends on the subject's age, so it carries significant age-related information. In this paper, we demonstrate that such aging patterns can be effectively extracted by the HFA-based discriminant subspace learning algorithm. Next, we apply multiple regression methods to the low-dimensional aging features learned from the HFA model. The effect of reduced dimensionality on accuracy is evaluated by extensive experiments and compared with the state-of-the-art methods. The effectiveness and robustness of the proposed framework, in terms of MAE and CS, are demonstrated through experimental analysis on the large-scale aging database MORPH II. The accuracy of our method is found to be superior to the current state-of-the-art methods.
- Research Article
3
- 10.1049/iet-bmt.2019.0104
- Mar 5, 2020
- IET Biometrics
Human ageing has a big impact on cross-age face recognition, and the effect of ageing on face recognition in non-ideal images has not yet been well addressed. In this study, the authors propose a discriminative common feature subspace learning method to deal with this problem. Specifically, considering that samples of the same individual with large age gaps have different distributions in the original space, they employ the maximum mean discrepancy as the distance measure to compute the distances between the sample means of the different distributions. This distance measure is then integrated into the Fisher criterion to learn a discriminative common feature subspace. The aim is to map images with different ages into the common subspace, and to construct a new feature representation that is robust to age variations and discriminative across subjects. To evaluate the performance of the proposed method on cross-age face recognition, the authors conduct extensive experiments on the CACD and FG-Net databases. Experimental results show that the proposed method outperforms other subspace-based methods and state-of-the-art cross-age face recognition methods.
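The distance between sample means that the authors plug into the Fisher criterion is the maximum mean discrepancy with a linear kernel; a minimal sketch, where the two arrays stand for feature sets of the same subject at different ages:

```python
import numpy as np

def linear_mmd2(X, Y):
    """Squared MMD with a linear kernel: the squared Euclidean distance
    between the empirical means of two sample sets."""
    return float(((X.mean(axis=0) - Y.mean(axis=0)) ** 2).sum())
```

Minimizing this quantity in the learned subspace pulls the age-separated distributions of one individual together, while the Fisher criterion keeps different subjects apart.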
- Research Article
9
- 10.1016/j.eswa.2021.116359
- Dec 11, 2021
- Expert Systems with Applications
Graph-based adaptive and discriminative subspace learning for face image clustering
- Research Article
98
- 10.1109/tip.2007.914203
- Feb 1, 2008
- IEEE Transactions on Image Processing
Images, as high-dimensional data, usually embody large variabilities. To classify images for versatile applications, an effective algorithm is necessarily designed by systematically considering the data structure, similarity metric, discriminant subspace, and classifier. In this paper, we provide evidence that, besides the Fisher criterion, graph embedding, and tensorization used in many existing methods, the correlation-based similarity metric embodied in supervised multilinear discriminant subspace learning can additionally improve the classification performance. In particular, a novel discriminant subspace learning algorithm, called correlation tensor analysis (CTA), is designed to incorporate both graph-embedded correlational mapping and discriminant analysis in a Fisher type of learning manner. The correlation metric can estimate intrinsic angles and distances for the locally isometric embedding, which can deal with the case when Euclidean metric is incapable of capturing the intrinsic similarities between data points. CTA learns multiple interrelated subspaces to obtain a low-dimensional data representation reflecting both class label information and intrinsic geometric structure of the data distribution. Extensive comparisons with most popular subspace learning methods on face recognition evaluation demonstrate the effectiveness and superiority of CTA. Parameter analysis also reveals its robustness.
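The correlation-based similarity CTA builds on differs from Euclidean distance in being shift- and scale-invariant; the snippet below is a generic illustration of the metric itself, not of the tensor analysis.

```python
import numpy as np

def correlation_similarity(x, y):
    """Pearson-style similarity: center both vectors, then take the
    cosine of the angle between them. Two images that differ only by
    brightness (shift) or contrast (scale) score as identical."""
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.linalg.norm(xc) * np.linalg.norm(yc) + 1e-12
    return float(xc @ yc / denom)
```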
- Components
- 10.1109/tcsvt.2021.3135316/mm1
- Dec 14, 2021
Latent subspace learning aims to produce a latent representation for better reconstruction and classification of high-dimensional data by exploiting the optimal subspace. Current latent subspace learning methods commonly have three problems: 1) the discriminative property is ignored when learning the latent subspace; 2) redundancy exists between the latent subspace and the prediction space; 3) there is no unified latent subspace that exploits knowledge jointly from the raw space, latent subspace, and label space. In this paper, we formulate the Joint Discriminative Latent Subspace Learning (JDLSL) problem to address these issues, and provide its optimization solution. JDLSL learns image representations from two aspects: a) the joint learning of latent subspaces for data reconstruction and prediction, and b) the joint learning of the label space and latent subspace for data reconstruction. To integrate knowledge from the joint learning, we organize the sparsity-induced latent subspace, where row sparsity and column sparsity are simultaneously imposed. We provide a theoretical proof of the discriminativity learning ability of the sparsity-induced latent subspace. Extensive experiments and comparisons with the state-of-the-art show that the proposed method has better performance. JDLSL shows competitive performance with deep features compared to deep learning architectures, reflecting its potential for integration with deep learning.
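The row and column sparsity JDLSL imposes are both instances of the $\ell_{2,1}$-norm, applied to a matrix and to its transpose; a small self-contained sketch of the penalty:

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: the sum of the Euclidean norms of the rows.
    Minimizing it drives whole rows to zero (row sparsity)."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())

W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
print(l21_norm(W))    # row norms 5 + 0 + 1 = 6.0
print(l21_norm(W.T))  # the same penalty on W.T enforces column sparsity
```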
- Research Article
4
- 10.1016/j.sigpro.2012.04.018
- May 10, 2012
- Signal Processing
Discriminative codebook learning for Web image search