Abstract

The visual assessment of tendency (VAT) technique, developed by J. C. Bezdek, R. J. Hathaway and J. M. Huband for visually finding the number of meaningful clusters in data, is very useful, but there is room for improvement. Instead of displaying the ordered dissimilarity matrix (ODM) as a 2D gray-level image for human interpretation, as VAT does, we trace the changes in dissimilarities along the diagonal of the ODM. This turns the 2D data structure (a matrix) into 1D arrays, displayed as what we call tendency curves, so that one can concentrate on a single variable, namely the height. One of these curves, called the d-curve, clearly shows the existence of cluster structure as patterns of peaks and valleys, which can be caught not only by the human eye but also by the computer. Our numerical experiments showed that the computer can catch cluster structure from the d-curve even in some cases where the human eye sees no structure in the visual output of VAT. Success in all numerical experiments was obtained using the same (fixed) set of program parameter values.

Highlights

  • Clustering is the problem of partitioning a set of objects O = {o1, o2, ..., on} into c self-similar subsets based on available data and some well-defined measure of similarity

  • Our numerical experiments showed that the computer can catch cluster structure from the d-curve even in some cases where the human eye sees no structure in the visual output of visual assessment of tendency (VAT)

  • For that matter, the approach can start from an ordered dissimilarity matrix produced by any algorithm of that kind


Summary

Introduction

Clustering is the problem of partitioning a set of objects O = {o1, o2, ..., on} into c self-similar subsets (clusters) based on available data and some well-defined measure of similarity. VAT reorders the points in a data set so that points that are close to one another in the feature space will generally have similar indices (see the example below). Some versions, such as sVAT [7], reduce the size of the pairwise dissimilarity matrix R by choosing a subset of the original set O. The numeric output of these algorithms is an ordered dissimilarity matrix (ODM). The approach of this paper is to trace changes in dissimilarities along the diagonal of the ODM, the numeric output of VAT that underlies its visual output, the ordered dissimilarity image (ODI).
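
The reordering and the idea of tracing along the diagonal can be illustrated with a minimal Python sketch. It assumes a symmetric dissimilarity matrix R and uses the commonly described VAT ordering: start from one end of the largest dissimilarity, then repeatedly append the unselected object closest to the set already selected. The function names vat_order and diagonal_trace and the window parameter w are hypothetical and chosen only for this illustration. In particular, diagonal_trace is a simple stand-in (the mean of the w entries just right of the diagonal in each row), not the paper's definition of the d-curve, which is not given in this summary.

```python
import numpy as np


def vat_order(R):
    """Return a VAT-style ordering of the objects (a permutation of indices).

    R is assumed to be a symmetric n x n dissimilarity matrix.
    """
    R = np.asarray(R, dtype=float)
    n = R.shape[0]
    # Start from one endpoint of the largest dissimilarity.
    start = int(np.unravel_index(np.argmax(R), R.shape)[0])
    order = [start]
    remaining = np.ones(n, dtype=bool)
    remaining[start] = False
    # dist[j] = smallest dissimilarity between j and any selected object.
    dist = R[start].copy()
    for _ in range(n - 1):
        candidates = np.where(remaining, dist, np.inf)
        j = int(np.argmin(candidates))
        order.append(j)
        remaining[j] = False
        dist = np.minimum(dist, R[j])
    return np.array(order)


def diagonal_trace(R_ordered, w=1):
    """Hypothetical 1D trace along the diagonal of an ODM.

    For each row i, average the (up to) w entries immediately to the
    right of the diagonal. This is an illustrative assumption, not the
    d-curve defined in the paper.
    """
    n = R_ordered.shape[0]
    return np.array([R_ordered[i, i + 1:i + 1 + w].mean() for i in range(n - 1)])


# Two well-separated 1D clusters, given in interleaved order.
x = np.array([0.0, 10.0, 0.1, 10.1, 0.2, 10.2])
R = np.abs(x[:, None] - x[None, :])

order = vat_order(R)
odm = R[np.ix_(order, order)]      # ordered dissimilarity matrix
trace = diagonal_trace(odm, w=1)   # one value per diagonal step
print(order)
print(trace)
```

In this toy case the ordering places the members of each cluster next to each other, and the trace stays low inside a cluster and jumps at the step that crosses from one cluster to the other. A 1D array like this is the kind of structure a program can scan for peaks and valleys without a human looking at an image.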

Visual Assessment of Cluster Tendency Using Diagonal Tracing
Numerical Examples
Conclusions
