Abstract

Hierarchical clustering has been used extensively in practice, since clusters can be assigned and analyzed simultaneously, which is especially useful when estimating the number of clusters is challenging. However, owing to the conventional proximity measures employed in these algorithms, they can detect only mass-shaped clusters and encounter problems in identifying complex data structures. Here, we introduce two bottom-up hierarchical approaches that exploit an information theoretic proximity measure to explore the nonlinear boundaries between clusters and to extract data structures beyond second-order statistics. Experimental results on both artificial and real datasets demonstrate the superiority of the proposed algorithms over conventional and information theoretic clustering algorithms reported in the literature, especially in detecting the true number of clusters.

Highlights

  • Clustering is an unsupervised approach for segregating data into its natural groups, such that the samples in each group have the highest similarity with each other and the highest dissimilarity with samples of the other groups

  • For the split-and-merge clustering, we report the mean quadratic mutual information estimated at each hierarchy in Figures 4b, 4d, and 4f, in which the error bars show the standard deviation over 10 repetitions of the clustering, each starting from a different initial clustering

  • Two hierarchical approaches are proposed for maximizing the quadratic mutual information between the samples of the input space and the clusters, namely the agglomerative and the split-and-merge clustering

Introduction

Clustering is an unsupervised approach for segregating data into its natural groups, such that the samples in each group have the highest similarity with each other and the highest dissimilarity with samples of the other groups. We cast clustering as maximizing the mutual information between the samples of the input space and the cluster labels, which requires an estimate of the underlying distribution. This distribution is estimated using a Parzen window estimator with Gaussian kernels centered on each sample and with a constant covariance. Such an estimate may seem simplistic and computationally expensive, but by exploiting Rényi's entropy estimator [13] in a quadratic form as the proximity measure, the mutual information can be estimated from pairwise distances; this quantity is referred to as the quadratic mutual information [14]. This proximity measure has been used in an iterative clustering scheme to optimize a clustering evaluation function that finds the nonlinear boundaries between clusters [15]. We propose two algorithms for the hierarchical optimization: the agglomerative and the split-and-merge clustering. In the former, at each hierarchy, the two clusters that maximize the mutual information are combined into one cluster.
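
To make this concrete, here is a minimal Python sketch, not the paper's implementation: it assumes hard cluster labels and a fixed, isotropic kernel bandwidth sigma, computes the Euclidean-distance form of the quadratic mutual information from the pairwise Gaussian Gram matrix, and runs a greedy agglomerative pass that merges the pair of clusters maximizing that quantity at each hierarchy. All function names and the default bandwidth are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

def gaussian_gram(X, sigma):
    # Pairwise terms G(x_i - x_j, 2*sigma^2*I): the integral of the product
    # of two Gaussian Parzen kernels of variance sigma^2 is a Gaussian of
    # variance 2*sigma^2 evaluated at the distance between their centers.
    d = X.shape[1]
    var = 2.0 * sigma ** 2
    sq_dists = cdist(X, X, "sqeuclidean")
    return np.exp(-sq_dists / (2.0 * var)) / (2.0 * np.pi * var) ** (d / 2.0)

def quadratic_mi(V, labels):
    # Euclidean-distance form of the quadratic mutual information between
    # the samples and a hard cluster assignment,
    #   sum_c INT (p(x, c) - p(x) P(c))^2 dx,
    # which reduces to sums over blocks of the pairwise Gram matrix V.
    n = V.shape[0]
    within = cross = prior_sq = 0.0
    for c in np.unique(labels):
        idx = labels == c
        p_c = idx.sum() / n
        within += V[np.ix_(idx, idx)].sum()   # sum_c INT p(x,c)^2 dx
        cross += p_c * V[idx, :].sum()        # sum_c P(c) INT p(x,c) p(x) dx
        prior_sq += p_c ** 2                  # sum_c P(c)^2
    return (within - 2.0 * cross + prior_sq * V.sum()) / n ** 2

def agglomerate(X, init_labels, n_clusters, sigma=0.1):
    # Greedy agglomerative pass: at each hierarchy, merge the pair of
    # clusters whose union yields the largest quadratic mutual information.
    V = gaussian_gram(X, sigma)
    labels = init_labels.copy()
    while len(np.unique(labels)) > n_clusters:
        best_qmi, best_pair = -np.inf, None
        ids = np.unique(labels)
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                trial = np.where(labels == ids[j], ids[i], labels)
                qmi = quadratic_mi(V, trial)
                if qmi > best_qmi:
                    best_qmi, best_pair = qmi, (ids[i], ids[j])
        labels = np.where(labels == best_pair[1], best_pair[0], labels)
    return labels
```

As a usage sketch, one could seed agglomerate with a deliberately over-segmented initial clustering (for instance, k-means with many clusters) and merge downward; in the same spirit as Figures 4b, 4d, and 4f, tracking the estimated mutual information at each hierarchy gives a signal for judging the number of clusters.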

Distortion-Rate Theory
Quadratic Mutual Information
Parzen Window Estimator with Gaussian Kernels
Hierarchical Optimization
Agglomerative Clustering
Split and Merge Clustering
Experimental
Conclusions