Abstract
Cluster analysis is concerned with the problem of partitioning a given set of entities into homogeneous and well-separated subsets called clusters. The concepts of homogeneity and of separation can be made precise when a measure of dissimilarity between the entities is given. Let us define the diameter of a partition of the given set of entities into clusters as the maximum dissimilarity between any pair of entities in the same cluster and the split of a partition as the minimum dissimilarity between entities in different clusters. The problems of determining a partition into a given number of clusters with minimum diameter (i.e., a partition of maximum homogeneity) or with maximum split (i.e., a partition of maximum separation) are first considered. It is shown that the latter problem can be solved by the classical single-link clustering algorithm, while the former can be solved by a graph-theoretic algorithm involving the optimal coloration of a sequence of partial graphs, described in more detail in a previous paper. A partition into a given number of clusters will be called efficient if and only if there exists no partition into at most the same number of clusters with smaller diameter and not smaller split or with larger split and not larger diameter. Two efficient partitions are called equivalent if and only if they have the same values for the split and for the diameter.
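The definitions above can be illustrated with a minimal Python sketch. It assumes a symmetric dissimilarity matrix `d` and a list of cluster labels. The function names (`diameter`, `split`, `single_link_partition`, `dominates`) and the toy data are hypothetical and not taken from the paper. The maximum-split partition is obtained here by ordinary single-link merging. The graph-coloring algorithm for minimum diameter is not reproduced.

```python
from itertools import combinations


def diameter(d, labels):
    """Largest dissimilarity between two entities in the same cluster."""
    pairs = combinations(range(len(labels)), 2)
    same = [d[i][j] for i, j in pairs if labels[i] == labels[j]]
    return max(same, default=0)


def split(d, labels):
    """Smallest dissimilarity between two entities in different clusters."""
    pairs = combinations(range(len(labels)), 2)
    diff = [d[i][j] for i, j in pairs if labels[i] != labels[j]]
    return min(diff, default=float("inf"))


def single_link_partition(d, k):
    """Single-link clustering: merge the two closest clusters until k remain."""
    n = len(d)
    parent = list(range(n))  # union-find forest over the entities

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    clusters = n
    edges = sorted((d[i][j], i, j) for i, j in combinations(range(n), 2))
    for _, i, j in edges:
        if clusters == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            clusters -= 1

    roots = [find(i) for i in range(n)]
    relabel = {r: c for c, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]


def dominates(a, b):
    """a and b are (diameter, split) pairs; True if a rules out b as efficient."""
    (da, sa), (db, sb) = a, b
    return (da < db and sa >= sb) or (sa > sb and da <= db)


# Toy data (an assumption for illustration): five points on a line.
points = [0, 1, 2, 6, 7]
d = [[abs(x - y) for y in points] for x in points]

labels = single_link_partition(d, 2)
print(labels, diameter(d, labels), split(d, labels))
```

On this toy input the single-link partition separates `{0, 1, 2}` from `{6, 7}`, and the `dominates` helper encodes the efficiency condition stated in the abstract: a partition is efficient when no partition with at most as many clusters dominates its (diameter, split) pair.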
More From: IEEE Transactions on Pattern Analysis and Machine Intelligence