Abstract
Clustering is hampered by sample distributions of varying shapes, such as lines and spirals. Spectral clustering and density peak clustering are two feasible techniques for addressing this problem and have attracted much attention from the academic community. However, spectral clustering still cannot handle some shapes of sample distributions in the space of extracted features, and density peak clustering suffers performance problems because it cannot mine the local structure of the data or deal well with non-uniform distributions. To solve these problems, we propose density gain-rate peak clustering (DGPC), a new type of density peak clustering method, and embed it in spectral clustering to improve performance. First, to handle non-uniform sample distributions, we introduce the density gain-rate for density peak clustering. The density gain-rate is based on the assumption that the density of a cluster center increases as the radius shrinks. Even under non-uniform distributions, a cluster center in a low-density region still has a significant density gain-rate and can therefore be detected. We incorporate the density gain-rate into density peak clustering to construct the DGPC method. Then, within the spectral clustering framework, we use DGPC to cluster the samples by the features extracted from a similarity graph of these samples, such as a neighbor-based similarity graph or a self-expressiveness similarity graph. Compared with previous spectral clustering and density peak clustering methods, our method achieves better clustering performance on sample distributions of varying shapes. Experiments on seven real-world datasets compare our method with existing clustering methods in terms of NMI and ACC and illustrate its effectiveness.
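For readers unfamiliar with the baseline that DGPC modifies, the following is a minimal sketch of standard density peak clustering with a cutoff kernel, not of DGPC itself. The abstract does not give the density gain-rate formula, so it is deliberately left out. All function names, the toy data, and the radius value are illustrative assumptions.

```python
import math

def density_peak_clustering(points, radius, n_clusters):
    """Baseline DPC sketch (cutoff kernel). This is NOT the DGPC method;
    the density gain-rate is not defined in the abstract and is omitted."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: number of other samples closer than the cutoff radius.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < radius)
           for i in range(n)]

    # delta_i: distance to the nearest sample with strictly higher density.
    # Samples with maximal density get their largest pairwise distance instead.
    delta, parent = [0.0] * n, [-1] * n
    for i in range(n):
        higher = [j for j in range(n) if rho[j] > rho[i]]
        if higher:
            parent[i] = min(higher, key=lambda j: dist[i][j])
            delta[i] = dist[i][parent[i]]
        else:
            delta[i] = max(dist[i])

    # Cluster centers: samples with the largest rho * delta.
    gamma = [r * d for r, d in zip(rho, delta)]
    centers = sorted(range(n), key=lambda i: gamma[i], reverse=True)[:n_clusters]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c

    # Visit samples in descending density so each parent is labeled first,
    # then let every remaining sample inherit the label of its parent.
    for i in sorted(range(n), key=lambda i: rho[i], reverse=True):
        if labels[i] == -1:
            if parent[i] != -1:
                labels[i] = labels[parent[i]]
            else:
                # Density maximum that was not chosen as a center:
                # fall back to the nearest chosen center.
                labels[i] = labels[min(centers, key=lambda c: dist[i][c])]
    return labels, centers

# Toy data (assumed): two well-separated, cross-shaped groups.
points = [(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0),
          (10, 10), (10, 11), (11, 10), (10, 9), (9, 10)]
labels, centers = density_peak_clustering(points, radius=1.5, n_clusters=2)
```

Because the density here is a raw neighbor count at one fixed radius, a center in a sparse cluster can score lower than ordinary samples in a dense cluster. That is the non-uniform distribution problem the abstract attributes to density peak clustering.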
Highlights
Clustering is widely used to extract potentially useful information in unsupervised learning environments
To solve the above problem, we propose density gain-rate peak clustering (DGPC), which is inspired by the information gain ratio, and we embed DGPC in spectral clustering [35]
Preliminaries: Our work builds on density peak clustering (DPC) and two spectral clustering (SC) methods, which we review
Summary
Clustering is widely used to extract potentially useful information in unsupervised learning environments. Density peak clustering (DPC) has emerged as another essential way to handle sample distributions of varying shapes [36]. Some researchers have proposed DP-SC [26], which uses DPC to cluster the samples after feature extraction in SC. Although DP-SC mines the local structure of samples better than DPC, it still cannot handle non-uniform sample distributions in the space of extracted features. Compared with previous SC and DPC methods, our DGPC-SC method clusters samples better under nonspherical and non-uniform distributions. The clustering results of this dataset with different clustering methods (k-means, DPC, SC, DP-SC, DGPC-SC) are shown in Fig. 1 (b) - (f).
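Comparisons between clustering methods such as those listed above are commonly scored with ACC and NMI, the two measures named in the abstract. The sketch below uses the usual textbook definitions, and the paper's exact variants are not stated here. It assumes that ACC takes the best one-to-one mapping between clusters and classes, found by brute force over permutations (practical only for a handful of clusters; the Hungarian algorithm is the usual choice at scale). It also assumes that NMI is normalized by the arithmetic mean of the two entropies. Function names are illustrative.

```python
import math
from collections import Counter
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """ACC: fraction of samples matched under the best one-to-one
    mapping from predicted clusters to true classes (brute force)."""
    true_ids = sorted(set(y_true))
    pred_ids = sorted(set(y_pred))
    # Pad with None so surplus predicted clusters can map to "no class".
    k = max(len(true_ids), len(pred_ids))
    targets = true_ids + [None] * (k - len(true_ids))
    pairs = Counter(zip(y_pred, y_true))
    best = 0
    for perm in permutations(targets, len(pred_ids)):
        hits = sum(pairs[(p, t)] for p, t in zip(pred_ids, perm))
        best = max(best, hits)
    return best / len(y_true)

def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def normalized_mutual_information(y_true, y_pred):
    """NMI with arithmetic-mean normalization (an assumed convention)."""
    n = len(y_true)
    ct, cp = Counter(y_true), Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    mi = sum((c / n) * math.log(c * n / (ct[t] * cp[p]))
             for (t, p), c in joint.items())
    denom = (entropy(y_true) + entropy(y_pred)) / 2
    # If both labelings are constant, treat them as identical partitions.
    return mi / denom if denom > 0 else 1.0

# Toy labels (assumed): cluster ids are arbitrary, so a permuted but
# otherwise perfect labeling should score as a perfect match.
y_true = [0, 0, 0, 1, 1, 1]
acc_perfect = clustering_accuracy(y_true, [1, 1, 1, 0, 0, 0])
nmi_perfect = normalized_mutual_information(y_true, [1, 1, 1, 0, 0, 0])
acc_partial = clustering_accuracy(y_true, [0, 0, 1, 1, 1, 1])
nmi_partial = normalized_mutual_information(y_true, [0, 0, 1, 1, 1, 1])
```

Both measures are invariant to how cluster ids are numbered, which is why they suit unsupervised comparisons where the predicted ids carry no meaning.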