Abstract

The feature distribution of high-dimension, small-sample-size (HDSS) data is sparse, which leads to unsatisfactory clustering results. Dimension reduction methods play an essential role in analyzing and visualizing high-dimensional data. However, directly reducing the dimension of an HDSS dataset is likely to produce singular matrices in subspace clustering. We therefore construct multiple data subsets from the original HDSS dataset for ensemble dimension reduction. Projection least-squares regression subspace clustering (PLSR), which combines a projection technique with least-squares regression, serves as the base dimension reducer; the resulting ensemble is called EPLSR. Taking the spectral properties of spectral clustering into account, we propose ensemble dimension reduction for subspace clustering based on spectral disturbance (SD-EPLSR). Following the theory of spectral disturbance, the weight coefficients are learned according to two principles: 1. The clustering result on each data subset should be close to the consensus clustering result. 2. Data subsets with similar clustering results should have similar weights. Experiments on eight HDSS datasets demonstrate that our method is effective.
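
As a rough illustration of the ensemble idea described above, the following Python sketch builds a least-squares regression affinity on several random feature subsets and combines them with a weighted sum. It is not the authors' SD-EPLSR: the random feature subsetting, the uniform default weights (standing in for weights learned from spectral disturbance), the omission of the projection step, and the names `lsr_affinity`, `ensemble_affinity`, `lam`, and `subset_dim` are all assumptions made for illustration.

```python
import numpy as np


def lsr_affinity(X, lam=0.1):
    """Least-squares regression affinity for samples in the rows of X.

    Solves min_Z ||X^T - X^T Z||_F^2 + lam * ||Z||_F^2, whose closed form
    is Z = (G + lam * I)^{-1} G with G = X X^T (an n x n Gram matrix).
    Working with G keeps the linear system at n x n, which is small for
    HDSS data, and lam > 0 keeps it non-singular.
    """
    G = X @ X.T
    n = G.shape[0]
    Z = np.linalg.solve(G + lam * np.eye(n), G)
    # Simplification (assumption): drop self-representation after solving.
    np.fill_diagonal(Z, 0.0)
    # Symmetrize so the result can be used as a graph affinity.
    return 0.5 * (np.abs(Z) + np.abs(Z.T))


def ensemble_affinity(X, n_subsets=10, subset_dim=50, lam=0.1,
                      weights=None, seed=0):
    """Weighted sum of LSR affinities built on random feature subsets.

    Assumption: each "data subset" is a random subset of features, and the
    weights default to uniform. SD-EPLSR instead learns the weights; that
    learning step is not reproduced here. If `weights` is given, its
    length determines the number of subsets.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    subset_dim = min(subset_dim, d)
    if weights is None:
        weights = np.full(n_subsets, 1.0 / n_subsets)
    W = np.zeros((n, n))
    for w in weights:
        idx = rng.choice(d, size=subset_dim, replace=False)
        W += w * lsr_affinity(X[:, idx], lam)
    return W
```

The resulting consensus affinity could then be handed to any spectral clustering routine that accepts a precomputed affinity matrix, for example scikit-learn's `SpectralClustering(affinity="precomputed")`.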

