Abstract

A new similarity measure for hierarchical clustering is proposed. The idea is to treat all data points as mass points in a hypothetical gravitational force field and to derive the hierarchical clustering results by estimating the travel time between data points: the shorter the time needed to travel from one point to another, the more similar the two points are. To avoid the complexity of a molecular dynamics simulation, the potential field produced by all the data points is computed, and the travel time between a pair of data points is then estimated from this potential field. The travel time is used to construct a new similarity measure, and an edge-weighted tree of all the data points is built to improve the efficiency of the hierarchical clustering. The proposed method, called Travel-Time based Hierarchical Clustering (TTHC), is evaluated by comparing it with four other hierarchical clustering methods. Two real datasets and two synthetic dataset families composed of 200 randomly generated datasets are used in our experiments. The results show that TTHC is very competitive, and that using the estimated travel time instead of the distance between data points can improve the robustness and quality of clustering.
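
The pipeline described above (potential field, travel-time estimate, edge-weighted tree, hierarchy) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the unit-mass potential `phi_i = -sum_j 1/d_ij`, the travel-time proxy (distance divided by a speed derived from potential depth), the use of a minimum spanning tree as the edge-weighted tree, and all function names. It should not be read as the TTHC algorithm itself.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix as nested lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def potentials(dist, eps=1e-9):
    """ASSUMED potential: unit masses, phi_i = -sum_{j != i} 1 / d_ij."""
    n = len(dist)
    return [-sum(1.0 / (dist[i][j] + eps) for j in range(n) if j != i)
            for i in range(n)]


def travel_time(i, j, dist, phi):
    """HYPOTHETICAL proxy: distance over a speed set by mean potential depth."""
    speed = math.sqrt(-(phi[i] + phi[j]) / 2.0)  # phi < 0 whenever n >= 2
    return dist[i][j] / speed


def minimum_spanning_tree(n, weight):
    """Prim's algorithm on the complete graph; returns (weight, u, v) edges."""
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                w = weight(u, v)
                if w < best[v]:
                    best[v], parent[v] = w, u
    return edges


def cluster(points, k):
    """Cut the k-1 slowest tree edges and label the remaining components."""
    n = len(points)
    dist = pairwise_distances(points)
    phi = potentials(dist)
    edges = minimum_spanning_tree(n, lambda u, v: travel_time(u, v, dist, phi))
    kept = sorted(edges)[: n - k]  # the tree has n-1 edges; drop the k-1 heaviest

    root = list(range(n))  # union-find over the kept edges

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

Calling `cluster(points, k)` with a list of coordinate tuples returns one integer label per point. Cutting tree edges from slowest to fastest is equivalent to reading off successive levels of a single-linkage-style hierarchy built on the assumed travel times rather than on raw distances.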
