Abstract

This article is an addendum to the 2001 paper [1], which investigated an approach to hierarchical clustering based on the level sets of a density function induced on data points in a d-dimensional feature space. We refer to this as the “level-sets approach” to hierarchical clustering. The density functions considered in [1] were those formed as the sum of identical radial basis functions centered at the data points, each radial basis function assumed to be continuous, monotone decreasing, convex on every ray, and rising to positive infinity at its center point. Such a framework can be investigated with respect to both the Euclidean (L2) and Manhattan (L1) metrics. This addendum puts forth some observations and questions about the level-sets approach that go beyond those in [1]. In particular, we pose and discuss the following questions. How does the level-sets approach compare with other related approaches? How is the resulting hierarchical clustering affected by the choice of radial basis function? What are the structural properties of a function formed as the sum of radial basis functions? Can the level-sets approach be theoretically validated? Is there an efficient algorithm to implement the level-sets approach?



Summary

Introduction

This article is an addendum to our 2001 paper [1], which considered algorithmic aspects of an approach to hierarchical clustering based on the level sets of a “density function” induced on data points in a d-dimensional feature space. The density functions are taken as the sum of identical radial basis functions centered at the data points, each radial basis function assumed to be continuous, monotone decreasing, convex along each ray, and rising to positive infinity at its center point. This framework can be investigated with respect to both the Euclidean (L2) and Manhattan (L1) metrics. It seems to us that there is still an opportunity for further development of frameworks, and for potential unification, regarding these methods of hierarchical clustering and their validation. Toward this end, the present addendum puts forth some observations and questions beyond those in [1] about the level-sets approach. To give proper context to some of the questions alluded to earlier, we wish to draw attention to two other approaches in the literature that are related to the level-sets approach in that all three can be viewed as physics-based approaches to hierarchical clustering.
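To make the density construction concrete, the following is a minimal sketch, not the implementation from [1]. It assumes the radial basis function phi(r) = 1/r, which is one function that is continuous for r > 0, monotone decreasing, convex, and rises to positive infinity at its center. The names `rbf_inverse`, `density`, and `in_superlevel_set` are hypothetical, and the threshold test f(x) >= c is our assumed reading of how a level set at height c partitions the feature space.

```python
import numpy as np


def rbf_inverse(r):
    """Assumed radial basis function phi(r) = 1/r (illustrative choice only).

    Returns +inf at r = 0, matching "rising to positive infinity at its center".
    """
    r = np.asarray(r, dtype=float)
    with np.errstate(divide="ignore"):
        return 1.0 / r


def density(x, points, metric="l2", phi=rbf_inverse):
    """Sum of identical radial basis functions centered at the data points."""
    x = np.asarray(x, dtype=float)
    points = np.asarray(points, dtype=float)
    diff = points - x
    if metric == "l2":      # Euclidean distance
        dist = np.sqrt((diff ** 2).sum(axis=1))
    elif metric == "l1":    # Manhattan distance
        dist = np.abs(diff).sum(axis=1)
    else:
        raise ValueError("metric must be 'l2' or 'l1'")
    return float(phi(dist).sum())


def in_superlevel_set(x, points, level, metric="l2"):
    """True if x lies in the region where the density is at least `level`."""
    return density(x, points, metric) >= level


if __name__ == "__main__":
    pts = [[0.0, 0.0], [2.0, 0.0]]
    print(density([1.0, 1.0], pts, metric="l2"))
    print(density([1.0, 1.0], pts, metric="l1"))
    print(in_superlevel_set([1.0, 0.0], pts, level=1.5))
```

Lowering `level` enlarges the region where the density is at least that value. Under the assumed reading, the regions around separate data points merge as the level drops, and that nesting is what yields a hierarchy. Swapping `phi` or `metric` lets one experiment with how the choice of radial basis function and metric changes the regions.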

Physics-Based Approaches to Hierarchical Clustering
Effect of Different Radial Basis Functions
Comparing the Physics-Based Approaches
Validating the Physics-Based Approaches
Ghost Clusters
Ridge Paths in Euclidean Space
Approximate Level-Set Hierarchical Clustering
Ridge Paths in Manhattan Space
Highest Paths of a Convex Function over a Convex Body
Conclusions