Hybrid clustering of data and vague concepts based on labels semantics

Zengchang Qin, Tao Wan, Hanqing Zhao

doi:10.1007/s10479-017-2541-0

Abstract

Data clustering is the process of dividing data elements into clusters so that items in the same cluster are as similar as possible and items in different clusters are as dissimilar as possible. A key issue in clustering is how to define a sensible similarity measure. Such measures usually handle data in a single modality and are unable to cluster data from different modalities. Based on the fuzzy set and prototype theory interpretations of label semantics, we propose two (dis)similarity measures with which data and vague concepts, represented by logical expressions of linguistic labels, can be clustered automatically. Experimental results on a toy problem and on an image classification problem demonstrate the effectiveness of the new clustering algorithms. Since the proposed measures can be extended to measure the distance between any two granularities, the new clustering algorithms can also be extended to cluster data instances and imprecise concepts represented by other granularities.
