Abstract

A novel potential-based hierarchical agglomerative (PHA) clustering method is proposed. In this method, we first construct a hypothetical potential field of all the data points, and show that this potential field is closely related to nonparametric estimation of the global probability density function of the data points. Then we propose a new similarity metric incorporating both the potential field which represents global data distribution information and the distance matrix which represents local data distribution information. Finally we develop another equivalent similarity metric based on an edge weighted tree of all the data points, which leads to a fast agglomerative clustering algorithm with time complexity O(N2). The proposed PHA method is evaluated by comparing with six other typical agglomerative clustering methods on four synthetic data sets and two real data sets. Experiments show that it runs much faster than the other methods and produces the most satisfying results in most cases.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call