Chapter 16 - Cluster Validity

Sergios Theodoridis, Konstantinos Koutroumbas

doi:10.1016/b978-1-59749-272-0.50018-9

Abstract

This chapter discusses the cluster validity stage of a clustering procedure. It presents methods suitable for the quantitative evaluation of the results of a clustering algorithm, known under the general term cluster validity. Cluster validity can be approached in three possible directions. The first is to evaluate C (where C is the clustering structure resulting from the application of a clustering algorithm to a data set X) in terms of an independently drawn structure, which is imposed on X a priori and reflects intuition about the clustering structure of X. The criteria used for this kind of evaluation are called external criteria. External criteria may also be used to measure the degree to which the available data confirm a prespecified structure, without applying any clustering algorithm to X. The second is to evaluate C in terms of quantities that involve the vectors of X themselves, such as the proximity matrix. The criteria used for this kind of evaluation are called internal criteria. The last approach is to evaluate C by comparing it with other clustering structures, resulting from the application of the same clustering algorithm with different parameter values, or of other clustering algorithms, to X. Criteria of this kind are called relative criteria. The chapter also gives the definitions of internal, external, and relative criteria and the random hypotheses used in each case. Indices adopted in the framework of external and internal criteria are presented, and examples are provided showing the use of these indices.
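As an illustration of an external criterion, the sketch below compares a clustering C with an independently drawn partition P of the same data set by counting the pairs of points on which the two agree (the Rand statistic). The function name `rand_statistic`, the label-list representation, and the toy labelings are assumptions made for this example; they are not taken from the chapter.

```python
from itertools import combinations


def rand_statistic(clustering, reference):
    """Fraction of point pairs on which two labelings agree.

    A pair counts as an agreement when both labelings put the two
    points in the same group, or both put them in different groups.
    """
    if len(clustering) != len(reference):
        raise ValueError("labelings must cover the same points")
    n = len(clustering)
    if n < 2:
        raise ValueError("need at least two points")

    agree = 0
    total = 0
    for i, j in combinations(range(n), 2):
        same_in_c = clustering[i] == clustering[j]
        same_in_p = reference[i] == reference[j]
        if same_in_c == same_in_p:
            agree += 1
        total += 1
    return agree / total


# Hypothetical labelings of six points: C from a clustering
# algorithm, P from an a priori partition of the same data.
C = [0, 0, 1, 1, 2, 2]
P = [0, 0, 0, 1, 1, 1]
print(rand_statistic(C, P))
```

A value of 1 means the two partitions agree on every pair. Judging whether a smaller value is unusually high or low requires comparing it against its distribution under a random hypothesis, which the sketch does not attempt.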

Full Text