Density peak clustering algorithms: A review on the decade 2014–2023

Yizhang Wang, Jiaxin Qian, Muhammad Hassan, Xinyu Zhang, Tao Zhang, Chao Yang, Xingxing Zhou, Fengjin Jia

doi:10.1016/j.eswa.2023.121860

Abstract

The density peak clustering (DPC) algorithm has become a well-known clustering method during the last decade. Research communities regard DPC as a powerful tool that has been applied in various fields, each with its own contemporary issues and future prospects. It is therefore time to summarize the research progress on DPC and help researchers quickly learn which issues have been resolved, which remain open, and what to do in the future. In this survey, we first describe several frequently used synthetic, UCI, and image datasets, and then review all DPC-related works, categorized as follows: finding clusters with different densities, optimizing parameter values, preventing domino effects, clustering large datasets, implementing parameter-less DPC, clustering mixed data, and clustering imbalanced data. Next, we compare recent and widely used extensions of DPC on 26 synthetic and UCI datasets. Finally, based on the above analysis, the survey concludes by discussing the improvement of DPC on synthetic and UCI datasets; revisiting challenges such as large-scale data clustering, parameter-less clustering, and privacy-preserving clustering; proposing solutions for deploying DPC on Spark; introducing deep clustering to DPC; and federating DPC. To the best of our knowledge, this is the first review that summarizes the progress of DPC over the last decade.

Full Text