Abstract

Clustering is a fundamental approach to discovering valuable information in data mining and machine learning. Density peaks clustering (DPC) is a typical density-based clustering method and has received increasing attention in recent years. However, DPC and most of its improved variants still suffer from drawbacks. For example, it is difficult to find peaks in sparse cluster regions, and the assignment of the remaining points tends to cause a domino effect, especially for complicated data. To address these two problems, we propose a generalized density peaks clustering algorithm (GDPC) based on a new order similarity, which is calculated from the order rank of the Euclidean distance between two samples. The order similarity helps to find peaks in sparse regions. In addition, a two-step assignment is used to weaken the domino effect. In general, GDPC can not only discover clusters in datasets of different sizes, dimensions, and shapes, but also address the above two issues. Experiments on several datasets, including Lung, COIL20, ORL, USPS, MNIST, Breast, and Vote, show that our algorithm is effective in most cases.
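
The abstract does not give the exact formula for the order similarity, so the following is only an illustrative sketch of the underlying idea: the order rank of the Euclidean distance between two samples. The function name `order_rank` and the toy data are hypothetical, and how GDPC turns such ranks into a similarity or a density is not specified here. The sketch shows why ranks may help in sparse regions: the nearest neighbor of a point has rank 1 whether the cluster is dense or sparse.

```python
import numpy as np

def order_rank(X):
    """Return an (n, n) integer matrix where rank[i, j] is the position of
    sample j when all samples are sorted by Euclidean distance from sample i.
    Assumption: this is one plausible reading of "order rank", not the
    definition used by GDPC. With no duplicate points, rank[i, i] == 0."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances via broadcasting.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # order[i, k] is the index of the k-th closest sample to i.
    order = np.argsort(dist, axis=1, kind="stable")
    # Invert the permutation row by row: rank[i, order[i, k]] = k.
    n = len(X)
    rank = np.empty_like(order)
    rank[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return rank

# Hypothetical toy data: a dense 1-D cluster and a sparse one.
X = np.array([[0.0], [0.1], [0.3], [10.0], [13.0], [17.0]])
rank = order_rank(X)
print(rank[1])  # ranks seen from a point in the dense cluster
print(rank[4])  # ranks seen from a point in the sparse cluster
```

Raw distances inside the sparse cluster are far larger than inside the dense one, yet the within-cluster ranks are the same small integers in both, which is the scale-free property a rank-based similarity relies on.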
