Clustering data and imprecise concepts

Weifeng Zhang, Zengchang Qin

doi:10.1109/fuzzy.2011.6007372

Abstract

Cluster analysis is the task of grouping a set of observations into clusters so that observations in the same cluster are similar in some sense. A key issue in clustering is how to define a sensible similarity measure. However, classical clustering algorithms, which rely on traditional distance measures, cannot cluster data instances together with imprecise concepts. In this paper, we propose a (dis)similarity measure based on a new knowledge representation framework called label semantics. Based on this measure, we can automatically cluster data instances and descriptive concepts represented by logical expressions of linguistic labels. Experimental results on a toy problem in image classification demonstrate the effectiveness of the proposed clustering algorithm. Since the proposed measure can be extended to measure the distance between any two granularities, the clustering algorithm can also be extended to cluster data instances and imprecise concepts represented by other granularities.

Full Text