A Theoretical Analysis of Density Peaks Clustering and the Component-Wise Peak-Finding Algorithm.

Joshua Tobin,Mimi Zhang

doi:10.1109/tpami.2023.3327471

Joshua Tobin, Mimi Zhang

Open Access

PDF Available

https://doi.org/10.1109/tpami.2023.3327471

Copy DOI

Export

Save

Cite

Abstract
Full-Text PDF
Similar Papers

Abstract

Listen

Density peaks clustering detects modes as points with high density and large distance to points of higher density. Each non-mode point is assigned to the same cluster as its nearest neighbor of higher density. Density peaks clustering has proved capable in applications, yet little work has been done to understand its theoretical properties or the characteristics of the clusterings it produces. Here, we prove that it consistently estimates the modes of the underlying density and correctly clusters the data with high probability. However, noise in the density estimates can lead to erroneous modes and incoherent cluster assignments. A novel clustering algorithm, Component-wise Peak-Finding (CPF), is proposed to remedy these issues. The improvements are twofold: 1) the assignment methodology is improved by applying the density peaks methodology within level sets of the estimated density; 2) the algorithm is not affected by spurious maxima of the density and hence is competent at automatically deciding the correct number of clusters. We present novel theoretical results, proving the consistency of CPF, as well as extensive experimental results demonstrating its exceptional performance. Finally, a semi-supervised version of CPF is presented, integrating clustering constraints to achieve excellent performance for an important problem in computer vision.

Full Text