Abstract
Discriminative subspace clustering (DSC) combines Linear Discriminant Analysis (LDA) with a clustering algorithm, such as K-means (KM), in a single framework that performs dimension reduction and clustering simultaneously. It has been verified to be effective for high-dimensional data. However, most existing DSC algorithms rigidly use the Frobenius norm (F-norm) to define the model, which may not always be suitable for the given data. In this paper, DSC is extended in the sense of the $l_{2,p}$-norm, which is a general form of the F-norm, to obtain a family of DSC algorithms that provide more alternative models for practical applications. To achieve this goal, an efficient algorithm for $l_{p}$-norm based KM (KM$_{p}$) clustering is first proposed. Then, based on the equivalence of LDA and linear regression, an $l_{2,p}$-norm based LDA ($l_{2,p}$-LDA) is proposed, and an efficient Iteratively Reweighted Least Squares algorithm for $l_{2,p}$-LDA is presented. Finally, KM$_{p}$ and $l_{2,p}$-LDA are combined into a single framework to form an efficient generalized DSC algorithm: $l_{2,p}$-norm based DSC ($l_{2,p}$-DSC). In addition, the effects of the parameters on the proposed algorithm are analyzed, and, based on the theory of robust statistics, a special case of $l_{2,p}$-DSC that shows better robustness on data sets with noise and outliers is studied. Extensive experiments are performed to verify the effectiveness of the proposed algorithm.
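For reference, the sense in which the $l_{2,p}$-norm generalizes the F-norm can be written out as follows. The abstract does not give the definition, so the row-wise convention, the matrix name $A$, and its size $n \times m$ are assumptions made here for illustration; the paper's objective may use the $p$-th power of this quantity instead.

```latex
% Assumed convention: a^i denotes the i-th row of the n-by-m matrix A.
\[
\|A\|_{2,p} \;=\; \Bigl( \sum_{i=1}^{n} \|a^{i}\|_{2}^{\,p} \Bigr)^{1/p},
\qquad
\|A\|_{2,2} \;=\; \Bigl( \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}^{2} \Bigr)^{1/2} \;=\; \|A\|_{F}.
\]
```

Setting $p = 2$ recovers the F-norm exactly. For $p < 2$, rows with a large $l_2$ norm contribute less than they do under the squared F-norm, which is the usual intuition behind the robustness to outliers mentioned above.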
Highlights
Cluster analysis is a basic method for multivariate statistical analysis, and it is an important part of unsupervised pattern recognition
Experimental results and analysis: the performance of the proposed l2,p-discriminative subspace clustering (DSC) algorithm is empirically studied in comparison with the intelligent KM (IKM) [41], DC [17], discriminative k-means (DKM) [19] and discriminative embedded clustering (DEC) [23] algorithms
Theoretical analysis and experimental results show that, compared with the DSC algorithm, l2,p-DSC with p1 < 2 and p2 < 2 is more robust to noise and outliers in the data
Summary
Cluster analysis is a basic method for multivariate statistical analysis, and it is an important part of unsupervised pattern recognition. Linear discriminant analysis (LDA) is a classical supervised method for feature extraction and dimension reduction [15]. It computes an optimal linear transformation matrix by minimizing the within-class distance of the data set while simultaneously maximizing the between-class distance in the linearly transformed low-dimensional space. In some recent work [16]–[18], LDA (or one of its variants) was combined with the clustering process to improve clustering performance. This method, called discriminative subspace clustering (DSC) in this paper, uses LDA to project the data onto the optimal transformation subspace, clusters the data in that low-dimensional subspace, and alternates between these two steps so that clustering and dimension reduction are performed simultaneously.
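The alternating scheme described above can be illustrated with a minimal sketch. It shows only the conventional F-norm variant (plain K-means plus a regularized LDA step), not the $l_{2,p}$-DSC algorithm proposed in the paper. The function names `kmeans`, `lda_transform` and `dsc`, the regularization constant `reg`, the initialization by K-means in the original space, and the stopping rule are all assumptions introduced for illustration.

```python
import numpy as np


def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means on the rows of Z; returns integer labels."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct data points (fancy indexing copies).
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def lda_transform(X, labels, k, dim, reg=1e-3):
    """LDA step: directions maximizing between-cluster scatter relative to
    the regularized total scatter, using cluster labels as class labels."""
    Xc = X - X.mean(axis=0)
    St = Xc.T @ Xc + reg * np.eye(X.shape[1])  # regularized total scatter
    Sb = np.zeros_like(St)                     # between-cluster scatter
    for j in range(k):
        members = Xc[labels == j]
        if len(members):
            m = members.mean(axis=0)
            Sb += len(members) * np.outer(m, m)
    # Solve Sb w = lambda St w through a Cholesky factor St = L L^T,
    # which turns it into a symmetric eigenproblem for L^{-1} Sb L^{-T}.
    L = np.linalg.cholesky(St)
    A = np.linalg.solve(L, Sb)
    M = np.linalg.solve(L, A.T).T
    _, V = np.linalg.eigh((M + M.T) / 2)       # eigenvalues in ascending order
    return np.linalg.solve(L.T, V[:, -dim:])   # top `dim` directions


def dsc(X, k, dim, n_outer=10, seed=0):
    """Alternate between the LDA projection and K-means in the subspace."""
    labels = kmeans(X, k, seed=seed)           # assumed initialization
    for _ in range(n_outer):
        W = lda_transform(X, labels, k, dim)
        new_labels = kmeans(X @ W, k, seed=seed)
        if np.array_equal(new_labels, labels):  # assumed stopping rule
            break
        labels = new_labels
    return labels, W
```

A call such as `labels, W = dsc(X, k=3, dim=2)` returns the cluster assignments together with the learned projection matrix. Replacing the squared-distance objectives in both steps with $l_p$ and $l_{2,p}$ counterparts is what the paper's generalization concerns, and this sketch does not attempt it.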