Abstract

Density peak clustering algorithms and their variants have achieved promising results in many fields over the last few years. However, most of these algorithms have parameters that must be fine-tuned by users. For real-world data without ground-truth labels, identifying good parameter values for parametric clustering algorithms is often challenging and time-consuming. To address this, we propose a density peak clustering algorithm guided by pseudo labels (PLDPC), which avoids manually pre-specified parameters by applying a mutual information criterion. Specifically, we first design a novel pseudo-label generation method based on the theory of co-occurrence. We then obtain better clustering results by maximizing mutual information. To evaluate the effectiveness of the proposed PLDPC algorithm, we conduct extensive experiments on 23 datasets, comprising six synthetic and seventeen real-world datasets. The experimental results show that PLDPC outperforms three classical algorithms (K-means, DPC, and DBSCAN) and eight state-of-the-art (SOTA) clustering algorithms in most cases.
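
The abstract does not spell out how the mutual information criterion is applied, so the following is only a minimal sketch of the general idea, not the PLDPC procedure itself. It assumes that several candidate clusterings are already available (for example, from different parameter settings) along with one pseudo-label assignment, and it keeps the candidate whose labels share the most mutual information with the pseudo labels. The function names, the candidate keys, and the toy labels are all hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two labelings of the same points."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("labelings must be non-empty and of equal length")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def select_by_mutual_information(candidates, pseudo_labels):
    """Return the key of the candidate labeling with the highest mutual information."""
    return max(
        candidates,
        key=lambda name: mutual_information(candidates[name], pseudo_labels),
    )


# Hypothetical pseudo labels and candidate clusterings for six points.
pseudo_labels = [0, 0, 0, 1, 1, 1]
candidates = {
    "setting_a": [0, 0, 0, 0, 0, 0],  # one cluster: shares no information
    "setting_b": [0, 0, 0, 1, 1, 1],  # matches the pseudo labels exactly
    "setting_c": [0, 0, 1, 1, 1, 1],  # partially agrees with the pseudo labels
}

best = select_by_mutual_information(candidates, pseudo_labels)
print(best)
```

Note that raw mutual information tends to favor partitions with many small clusters, so a normalized variant may be preferable in practice; which form the authors use is not stated here.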

